Abstract

Logic languages based on the theory of rational, possibly infinite, trees have much 
appeal in that rational trees allow for faster unification (due to the safe omission of 
the occurs-check) and increased expressivity (cyclic terms can provide very efficient 
representations of grammars and other useful objects). Unfortunately, the use of 
infinite rational trees has problems. For instance, many of the built-in and library 
predicates are ill-defined for such trees and need to be supplemented by run-time 
checks whose cost may be significant. Moreover, some widely-used program analysis 
and manipulation techniques are correct only for those parts of programs working 
over finite trees. It is thus important to obtain, automatically, a knowledge of the 
program variables (the finite variables) that, at the program points of interest, will 
always be bound to finite terms. For these reasons, we propose here a new data- 
flow analysis, based on abstract interpretation, that captures such information. We 
present a parametric domain where a simple component for recording finite variables 
is coupled, in the style of the open product construction of Cortesi et al., with a 
generic domain (the parameter of the construction) providing sharing information. 
The sharing domain is abstractly specified so as to guarantee the correctness of 
the combined domain and the generality of the approach. This finite-tree analysis 
domain is further enhanced by coupling it with a domain of Boolean functions, called 
finite-tree dependencies, that precisely captures how the finiteness of some variables 
influences the finiteness of other variables. We also summarize our experimental 
results showing how finite-tree analysis, enhanced with finite-tree dependencies, is 
a practical means of obtaining precise finiteness information. 

Key words: static analysis, abstract interpretation, rational unification, 
occurs-check 
1 Introduction



The intended computation domain of most logic-based languages^1 includes
the algebra (or structure) of finite trees. Other (constraint) logic-based
languages, such as Prolog II and its successors [1,2], SICStus Prolog [3], and Oz
[4], refer to a computation domain of rational trees.^2 A rational tree is a
possibly infinite tree with a finite number of distinct subtrees and where each
node has a finite number of immediate descendants. These properties ensure
that rational trees, even though infinite in the sense that they admit paths of
infinite length, can be finitely represented. One possible representation makes
use of connected, rooted, directed and possibly cyclic graphs where nodes are
labeled with variable and function symbols as is the case of finite trees.
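
The graph representation just described can be illustrated with a small sketch. The class name `Node` and the function `is_finite` are hypothetical names chosen for this illustration, and the use of '.' as the list constructor is only a convention; the sketch assumes that a term is given as a rooted graph and that the denoted tree is finite exactly when no cycle is reachable from the root.

```python
class Node:
    """A node of a term graph: a symbol plus an ordered list of child nodes."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)


def is_finite(root):
    """Return True iff no cycle is reachable from `root`, i.e. the denoted
    tree is finite. Iterative depth-first search with two node states."""
    ON_PATH, DONE = 1, 2
    state = {id(root): ON_PATH}
    stack = [(root, iter(root.children))]
    while stack:
        node, pending = stack[-1]
        for child in pending:
            s = state.get(id(child))
            if s == ON_PATH:        # back edge: the tree has an infinite path
                return False
            if s is None:           # unvisited: descend into it
                state[id(child)] = ON_PATH
                stack.append((child, iter(child.children)))
                break
        else:                       # all children explored
            state[id(node)] = DONE
            stack.pop()
    return True


# The finite list [a] and the cyclic term X = [a|X].
finite = Node('.', [Node('a'), Node('[]')])
cyclic = Node('.', [Node('a')])
cyclic.children.append(cyclic)      # the tail points back to the list itself

print(is_finite(finite))   # True
print(is_finite(cyclic))   # False
```

The cyclic term has only two distinct subtrees (the constant a and the list itself), which is why a two-node graph suffices to represent an infinite tree.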

Applications of rational trees in logic programming include graphics [6] , parser 
generation and grammar manipulation [1,7], and computing with finite-state 
automata [1]. Rational trees also constitute the basis of the abstract domain 
of rigid type graphs, which is used for type analysis of logic programs [8,9,10]. 
Other applications are described in [11] and [12]. Very recently, Manuel Carro 
has described a nice application of rational trees where they are used to repre- 
sent imperative programs within interpreters. Taking a continuation-passing 
style approach, each instruction is coupled with a data structure representing 
the remaining part of the program to be executed so that sequences of instruc- 
tions for realizing (backward) jumps, iterations and recursive calls give rise to 
cyclic structures in the form of rational trees. Compared to a naive interpreter
for the same language, this threaded interpreter is faster and uses less memory, 
at the cost of a simple preliminary "compilation pass" to generate the rational 
tree representation for the program [13]. 

Going from Prolog to CLP, in [14] K. Mukai has combined constraints on 
rational trees and record structures, while the logic-based language Oz allows 
constraints over rational and feature trees [4]. The expressive power of rational 
trees is put to use, for instance, in several areas of natural language processing. 



* This work has been partly supported by MURST projects "Automatic Program 
Certification by Abstract Interpretation", "Abstract Interpretation, Type Systems 
and Control-Flow Analysis", and "Constraint Based Verification of Reactive Sys- 
tems." Some of this work was done during visits of the fourth author to Leeds, 
funded by EPSRC under grant M05645. 

Email addresses: bagnara@cs.unipr.it (Roberto Bagnara), 
gori@di.unipi.it (Roberta Gori), hill@comp.leeds.ac.uk (Patricia M. Hill),
zaffanella@cs.unipr.it (Enea Zaffanella). 

^1 That is, ordinary logic languages, (concurrent) constraint logic languages,
functional logic languages and variations of the above.

^2 Support for rational trees is also provided as an option by the YAP Prolog
system [5].






Rational trees are used in implementations of the HPSG formalism
(Head-driven Phrase Structure Grammar) [15], in the ALE system (Attribute Logic
Engine) [16], and in the ProFIT system (Prolog with Features, Inheritance
and Templates) [17].

While rational trees allow for increased expressivity, they also come equipped 
with a surprising number of problems. As we will see, some of these problems 
are so serious that rational trees must be used in a very controlled way, disal- 
lowing them in any context where they are "dangerous." This, in turn, causes 
a secondary problem: in order to disallow rational trees in selected contexts 
one must first detect them, an operation that may be expensive. 

The first thing to be aware of is that almost any semantics-based program ma- 
nipulation technique developed in the field of logic programming — whether it 
be an analysis, a transformation, or an optimization — assumes a computation 
domain of finite trees. Some of these techniques might work with rational trees 
but their correctness has only been proved in the case of finite trees. Others 
are clearly inapplicable. Let us consider a very simple Prolog program: 

list([]). 

list([_|T]) :- list(T). 

Most automatic and semi-automatic tools for proving program termination^3
and for complexity analysis^4 agree on the fact that list/1 will terminate
when invoked with a ground argument. Consider now the query

?- X = [a|X], list(X).

and note that, after the execution of the first rational unification, the variable 
X will be bound to a rational term containing no variables, i.e., the predicate 
list/1 will be invoked with X ground. However, if such a query is given 
to, say, SICStus Prolog, then the only way to get the prompt back is by 
interrupting the program. The problem stems from the fact that the analysis 
techniques employed by these tools are only sound for finite trees: as soon as 
they are applied to a system where the creation of cyclic terms is possible, their 
results are inapplicable. The situation can be improved by combining these 
termination and/or complexity analyses with a finiteness analysis providing 
the precondition for the applicability of the other techniques. 

^3 Such as TerminWeb [18,19], TermiLog [20], cTI [21], and LPTP [22,23].

^4 Systems like GAIA [24], CASLOG [25], and the Ciao-Prolog preprocessor [26].

The implementation of built-in predicates is another problematic issue.
Indeed, it is widely acknowledged that, for the implementation of a system that
provides real support for rational trees, the biggest effort concerns proper
handling of built-ins. Of course, the meaning of 'proper' depends on the actual
built-in. Built-ins such as copy_term/2 and ==/2 maintain a clear semantics
when passing from finite to rational trees. For others, like sort/2, the
extension can be questionable:^5 failing, raising an exception, answering Y = [a]
(if duplicates are deleted) and answering Y = [a|Y] (if duplicates are kept)
can all be argued to be "the right reaction" to the query

?- X = [a|X], sort(X, Y).

Other built-ins do not tolerate infinite trees in some argument positions. A 
good implementation should check for finiteness of the corresponding argu- 
ments and make sure "the right thing" — failing or raising an appropriate 
exception — always happens. However, such behavior appears to be uncom- 
mon. A small experiment we conducted on six Prolog implementations with 
queries like 

?- X = 1+X, Y is X.

?- X = [97|X], name(Y, X).

?- X = [X|X], Y =.. [f|X].

resulted in infinite loops, memory exhaustion and/or system thrashing, seg- 
mentation faults or other fatal errors. One of the implementations tested, 
SICStus Prolog, is a professional one and implements run-time checks to avoid 
most cases where built-ins can have catastrophic effects.^6 The remaining
systems are a bit more than research prototypes, but will clearly have to do the
same if they evolve to the stage of production tools. Again, a data-flow
analysis aimed at the detection of those variables that are definitely bound to
finite terms could be used to avoid a (possibly significant) fraction of the
useless run-time checks. Note that what has been said for built-in predicates
applies to libraries as well. Even though it may be argued that it is enough for
programmers to know that they should not use a particular library predicate with
infinite terms, it is clear that the use of a "safe" library, including automatic
checks ensuring that such a predicate is never called with an illegal argument,
will result in a more robust system. With the appropriate data-flow analyses,
safe libraries do not have to be inefficient libraries.

^5 Even though sort/2 is not required to be a built-in by the ISO Prolog standard,
it is offered as such by several implementations.

^6 SICStus 3.11 still loops on ?- X = [97|X], name(Y, X).

Another serious problem is the following: the standard term ordering dictated
by ISO Prolog [27] cannot be extended to rational trees [M. Carlsson,
Personal communication, October 2000]. Consider the rational trees defined by
A = f(B, a) and B = f(A, b). Clearly, A == B does not hold. Since the
standard term ordering is total, we must have either A @< B or B @< A. Assume
A @< B. Then f(A, b) @< f(B, a), since the ordering of terms having the
same principal functor is inherited from the ordering of subterms considered in
a left-to-right fashion. Thus B @< A must hold, which is a contradiction. A
dual contradiction is obtained by assuming B @< A. As a consequence, applying
any Prolog term-ordering predicate to terms where one or both of them
is infinite may cause inconsistent results, giving rise to bugs that are
exceptionally difficult to diagnose. For this reason, any system that extends ISO
Prolog with rational trees ought to detect such situations and make sure they
are not ignored (e.g., by throwing an exception or aborting execution with a
meaningful message). However, predicates such as the term-ordering ones are
likely to be called a significant number of times, since they are often used to
maintain structures implementing ordered collections of terms. This is another
instance of the efficiency issue mentioned above.

Still on efficiency, it is worth noting that even for built-ins whose definition on 
rational trees is not problematic, there is often a performance penalty in cater- 
ing for the possibility of infinite trees. Thus, for such predicates, which include 
rational unification provided by =/2, a compile-time knowledge of term finite- 
ness can be beneficial. For instance, rational-tree implementations of the built- 
ins ground/1, term_variables/2, copy_term/2, subsumes/2, variant/2 and 
numbervars/3 need more expensive marking techniques to ensure they do not 
enter an infinite loop. With finiteness information it is possible to avoid this 
overhead. 
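
The cost difference mentioned above can be made concrete with a minimal sketch of a groundness test. The names `Node`, `is_ground_finite_only` and `is_ground_marking` are hypothetical and do not correspond to any Prolog system's internals; the sketch only assumes that terms are stored as graphs and that "marking" means remembering which nodes have already been visited.

```python
class Node:
    """A term-graph node; `is_var` distinguishes variables from function symbols."""

    def __init__(self, label, children=(), is_var=False):
        self.label = label
        self.children = list(children)
        self.is_var = is_var


def is_ground_finite_only(t):
    """Plain recursion with no bookkeeping. Correct on finite trees only:
    on a cyclic term it would recurse without bound."""
    return not t.is_var and all(is_ground_finite_only(c) for c in t.children)


def is_ground_marking(root):
    """Traversal that marks visited nodes, so it also terminates on cyclic
    terms, at the price of maintaining the set of marks."""
    marked = set()
    todo = [root]
    while todo:
        node = todo.pop()
        if id(node) in marked:      # already inspected: do not loop
            continue
        marked.add(id(node))
        if node.is_var:
            return False
        todo.extend(node.children)
    return True


# X = [a|X] is ground although infinite; f(Y) is finite but not ground.
cyclic = Node('.', [Node('a')])
cyclic.children.append(cyclic)
open_term = Node('f', [Node('Y', is_var=True)])

print(is_ground_marking(cyclic))      # True
print(is_ground_marking(open_term))   # False
```

When an analysis guarantees that the argument is finite, the cheaper variant without marks can be selected safely.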

In this paper, we present a parametric abstract domain for finite-tree analysis, 
denoted by H x P. This domain combines a simple component H (written with
the initial of Herbrand and called the finiteness component) recording the set 
of definitely finite variables, with a generic domain P (the parameter of the 
construction) providing sharing information. The term "sharing information" 
is to be understood in its broader meaning, which includes variable aliasing, 
groundness, linearity, freeness and any other kind of information that can 
improve the precision on these components, such as explicit structural infor- 
mation. Several domain combinations and abstract operators, characterized 
by different precision/complexity trade-offs, have been proposed to capture 
these properties (see [28,29] for an account of some of them). By giving a 
generic specification for this parameter component, in the style of the open 
product construct proposed in [30], it is possible to define and establish the 
correctness of abstract operators on the finite-tree domain independently from 
any particular domain for sharing analysis. 

The information encoded by H is attribute independent [31], which means 
that each variable is considered in isolation. What this lacks is information 
about how finiteness of one variable affects the finiteness of other variables. 
This kind of information, usually called relational information, is not
captured at all by H and is only partially captured by the composite domain
H x P. Moreover, H x P is designed to capture the "negative" aspect of
term-finiteness, that is, the circumstances under which finiteness can be lost. 
However, term-finiteness also has a "positive" aspect: there are cases where
a variable is guaranteed to be bound to a finite term and this knowledge can be
propagated to other variables. Guarantees of finiteness are provided by several
built-ins like unify_with_occurs_check/2, var/1, name/2, all the arithmetic
predicates, besides those explicitly provided to test for term-finiteness such as
the acyclic_term/1 predicate of SICStus Prolog. For these reasons H x P
is coupled with a domain of Boolean functions that precisely captures how 
the finiteness of some variables influences the finiteness of other variables. 
This domain of finite-tree dependencies provides relational information that 
is important for the precision of the overall finite-tree analysis. It also com- 
bines obvious similarities, interesting differences and somewhat unexpected 
connections with classical domains for groundness dependencies. Finite-tree 
and groundness dependencies are similar in that they both track covering in- 
formation (a term s covers t if all the variables in t also occur in s) and share 
several abstract operations. However, they are different because covering does 
not tell the whole story. Suppose x and y are free variables before either the
unification x = f(y) or the unification x = f(x, y) is executed. In both cases,
x will be ground if and only if y will be so. However, when x = f(y) is the
performed unification, this equivalence will also carry over to finiteness. In
contrast, when the unification is x = f(x, y), x will never be finite and will
be totally independent, as far as finiteness is concerned, from y. Among the
unexpected connections is the fact that finite-tree dependencies can improve 
the groundness information obtained by the usual approaches to groundness 
analysis. 
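
The difference between the two unifications can be checked on small examples. The following sketch is an illustration only: terms are encoded as nested tuples `(functor, arg1, ..., argn)` with variables as strings, a substitution is a dictionary from variables to terms, and `denotes_finite_tree` is a hypothetical helper. It assumes the substitution contains no cycle made of variable-to-variable bindings only, so that every cycle among the bindings passes through a function symbol and therefore yields an infinite tree.

```python
def term_vars(t):
    """Set of variables occurring in a finite term."""
    if isinstance(t, str):
        return {t}
    result = set()
    for arg in t[1:]:
        result |= term_vars(arg)
    return result


def denotes_finite_tree(sigma, x):
    """True iff repeatedly applying the bindings of `sigma` to variable `x`
    yields a finite tree, i.e. no binding cycle is reachable from `x`."""
    on_path, done = set(), set()

    def visit(v):
        if v in done:
            return True
        if v in on_path:            # v reaches itself through its bindings
            return False
        if v not in sigma:          # unbound variable: a (finite) leaf
            done.add(v)
            return True
        on_path.add(v)
        ok = all(visit(w) for w in term_vars(sigma[v]))
        on_path.discard(v)
        if ok:
            done.add(v)
        return ok

    return visit(x)


# After x = f(y): x is finite exactly when y is.
print(denotes_finite_tree({'x': ('f', 'y'), 'y': ('a',)}, 'x'))      # True
print(denotes_finite_tree({'x': ('f', 'y'), 'y': ('g', 'y')}, 'x'))  # False

# After x = f(x, y): x is infinite whatever y is bound to.
print(denotes_finite_tree({'x': ('f', 'x', 'y'), 'y': ('a',)}, 'x'))  # False
```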

The paper is structured as follows. The required notations and preliminary 
concepts are given in Section 2. The concrete domain for the analysis is pre- 
sented in Section 3. The finite-tree domain is then introduced in Section 4: 
Section 4.1 provides the specification of the parameter domain P; Section 4.2 
defines some computable operators that extract, from substitutions in ratio- 
nal solved form, properties of the denoted rational trees; Section 4.3 defines 
the abstraction function for the finiteness component H; Section 4.4 defines 
the abstract unification operator for H x P. Section 5 introduces the use of 
Boolean functions for tracking finite-tree dependencies, whereas Section 6 il- 
lustrates the interaction between groundness and finite-tree dependencies. Our 
experimental results are presented in Section 7. We conclude the main body 
of the paper in Section 8. 

Appendix A specifies the sharing domain SFL defined in [32,33] as a possible 
instance of the parameter P. All the results are then proved in Appendix B. 

This paper is a combined and improved version of [34] and [35]. 
2 Preliminaries



2.1 Infinite Terms and Substitutions 



The cardinality of a set $S$ is denoted by $\# S$; $\wp(S)$ is the powerset of
$S$, whereas $\wp_f(S)$ is the set of all the finite subsets of $S$. Let Sig
denote a possibly infinite set of function symbols, ranked over the set of
natural numbers. It is assumed that Sig contains at least one function symbol
having rank 0 and one having rank greater than 0. Let Vars denote a denumerable
set of variables disjoint from Sig and Terms denote the free algebra of all
(possibly infinite) terms in the signature Sig having variables in Vars. Thus a
term can be seen as an ordered labeled tree, possibly having some infinite paths
and possibly containing variables: every non-leaf node is labeled with a function
symbol in Sig with a rank matching the number of the node's immediate
descendants, whereas every leaf is labeled by either a variable in Vars or a
function symbol in Sig having rank 0 (a constant).

If $t \in \mathit{Terms}$ then $\mathrm{vars}(t)$ and $\mathrm{mvars}(t)$ denote
the set and the multiset of variables occurring in $t$, respectively. We will
also write $\mathrm{vars}(o)$ to denote the set of variables occurring in an
arbitrary syntactic object $o$.

Suppose $s, t \in \mathit{Terms}$: $s$ and $t$ are independent if
$\mathrm{vars}(s) \cap \mathrm{vars}(t) = \emptyset$; $t$ is said to be ground
if $\mathrm{vars}(t) = \emptyset$; $t$ is free if $t \in \mathit{Vars}$; if
$y \in \mathrm{vars}(t)$ occurs exactly once in $t$, then we say that variable
$y$ occurs linearly in $t$, more briefly written using the predication
$\mathrm{occ\_lin}(y, t)$; $t$ is linear if we have $\mathrm{occ\_lin}(y, t)$
for all $y \in \mathrm{vars}(t)$; finally, $t$ is a finite term (or Herbrand
term) if it contains a finite number of occurrences of function symbols. The
sets of all ground, linear and finite terms are denoted by GTerms, LTerms and
HTerms, respectively. As we have specified that Sig contains function symbols of
rank 0 and rank greater than 0,
$\mathit{GTerms} \cap \mathit{HTerms} \neq \emptyset$ and
$\mathit{GTerms} \setminus \mathit{HTerms} \neq \emptyset$.

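A substitution is a total function $\sigma \colon \mathit{Vars} \to \mathit{HTerms}$
that is the identity almost everywhere; in other words, the domain of $\sigma$,

\[
  \mathrm{dom}(\sigma) := \{\, x \in \mathit{Vars} \mid \sigma(x) \neq x \,\},
\]

is finite. Given a substitution $\sigma \colon \mathit{Vars} \to \mathit{HTerms}$,
we overload the symbol '$\sigma$' so as to denote also the function
$\sigma \colon \mathit{HTerms} \to \mathit{HTerms}$ defined as follows, for
each term $t \in \mathit{HTerms}$:

\[
  \sigma(t) :=
  \begin{cases}
    t, & \text{if $t$ is a constant symbol;} \\
    \sigma(t), & \text{if $t \in \mathit{Vars}$;} \\
    f\bigl(\sigma(t_1), \ldots, \sigma(t_n)\bigr),
       & \text{if $t = f(t_1, \ldots, t_n)$.}
  \end{cases}
\]

If $t \in \mathit{HTerms}$, we write $t\sigma$ to denote $\sigma(t)$ and
$t\sigma\tau$ to denote $(t\sigma)\tau$.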



If $x \in \mathit{Vars}$ and $t \in \mathit{HTerms} \setminus \{x\}$, then
$x \mapsto t$ is called a binding. The set of all bindings is denoted by Bind.
Substitutions are denoted by the set of their bindings, thus a substitution
$\sigma$ is identified with the (finite) set

\[
  \{\, x \mapsto x\sigma \mid x \in \mathrm{dom}(\sigma) \,\}.
\]

We denote by $\mathrm{vars}(\sigma)$ the set of variables occurring in the
bindings of $\sigma$. A substitution is said to be circular if, for $n > 1$, it
has the form

\[
  \{ x_1 \mapsto x_2, \ldots, x_{n-1} \mapsto x_n, x_n \mapsto x_1 \},
\]

where $x_1, \ldots, x_n$ are distinct variables. A substitution is in rational
solved form if it has no circular subset. The set of all substitutions in
rational solved form is denoted by RSubst.

The composition of substitutions is defined in the usual way. Thus
$\tau \circ \sigma$ is the substitution such that, for all terms
$t \in \mathit{HTerms}$,

\[
  (\tau \circ \sigma)(t) = \tau\bigl(\sigma(t)\bigr)
\]

and has the formulation

\[
  \tau \circ \sigma =
  \{\, x \mapsto x\sigma\tau \mid
       x \in \mathrm{dom}(\sigma) \cup \mathrm{dom}(\tau),\;
       x \neq x\sigma\tau \,\}.
\]

As usual, $\sigma^0$ denotes the identity function (i.e., the empty
substitution) and, when $i > 0$, $\sigma^i$ denotes the substitution
$(\sigma \circ \sigma^{i-1})$.

Consider an infinite sequence of terms $t_0, t_1, t_2, \ldots$ with
$t_i \in \mathit{HTerms}$ for each $i \in \mathbb{N}$. Suppose there exists
$t \in \mathit{Terms}$ such that, for each $n \in \mathbb{N}$, there exists
$m_0 \in \mathbb{N}$ such that, for each $m \in \mathbb{N}$ with $m > m_0$, the
trees corresponding to the terms $t$ and $t_m$ coincide up to the first $n$
levels. Then we say that the sequence $t_0, t_1, t_2, \ldots$ converges to $t$
and we write $t = \lim_{i \to \infty} t_i$ [36].

For each $\sigma \in \mathit{RSubst}$ and $t \in \mathit{HTerms}$, the sequence
of finite terms

\[
  \sigma^0(t), \sigma^1(t), \sigma^2(t), \ldots
\]

converges [36,37]. Therefore, the function
$\mathrm{rt} \colon \mathit{HTerms} \times \mathit{RSubst} \to \mathit{Terms}$
such that

\[
  \mathrm{rt}(t, \sigma) := \lim_{i \to \infty} \sigma^i(t)
\]

is well defined.
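
The definitions of this section can be exercised on small examples. The sketch below is only an illustration under stated assumptions: terms are encoded as nested tuples `(functor, arg1, ..., argn)`, variables as strings, substitutions as dictionaries, and the helper names `apply`, `compose`, `power` and `truncate` are hypothetical. It shows the composition formula and how the finite terms obtained by iterating a substitution agree on ever deeper prefixes, which is the sense in which they converge to a rational tree.

```python
def apply(sigma, t):
    """Apply substitution `sigma` to the finite term `t`."""
    if isinstance(t, str):                       # a variable
        return sigma.get(t, t)
    return (t[0],) + tuple(apply(sigma, arg) for arg in t[1:])


def compose(tau, sigma):
    """Bindings x -> (x sigma) tau for x in dom(sigma) or dom(tau),
    dropping those that map x to itself."""
    result = {}
    for x in set(sigma) | set(tau):
        image = apply(tau, apply(sigma, x))
        if image != x:
            result[x] = image
    return result


def power(sigma, i, t):
    """The term obtained by applying `sigma` to `t` exactly `i` times."""
    for _ in range(i):
        t = apply(sigma, t)
    return t


def truncate(t, n):
    """The first `n` levels of `t`; deeper subtrees are replaced by None."""
    if n == 0:
        return None
    if isinstance(t, str):
        return t
    return (t[0],) + tuple(truncate(arg, n - 1) for arg in t[1:])


# Composition: first x -> f(y), then y -> a.
print(compose({'y': ('a',)}, {'x': ('f', 'y')}))

# sigma = {x -> f(a, x)}: the iterates differ as whole terms, but from the
# second iterate on they all share the same first two levels.
sigma = {'x': ('f', ('a',), 'x')}
print(truncate(power(sigma, 2, 'x'), 2) == truncate(power(sigma, 5, 'x'), 2))
```

For a substitution without cycles, such as {x -> f(y), y -> a}, the sequence becomes stationary after finitely many steps and the limit is an ordinary finite term.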
2.2 Equations



An equation is a statement of the form $s = t$ where $s, t \in \mathit{HTerms}$.
Eqs denotes the set of all equations. As usual, a system of equations (i.e., a
conjunction of elements in Eqs) will be denoted by a subset of Eqs. A
substitution $\sigma$ may be regarded as a finite set of equations, that is, as
the set $\{\, x = t \mid (x \mapsto t) \in \sigma \,\}$. A set of equations $e$
is in rational solved form if
$\{\, s \mapsto t \mid (s = t) \in e \,\} \in \mathit{RSubst}$.
In the rest of the paper, we will often write a substitution
$\sigma \in \mathit{RSubst}$ to denote a set of equations in rational solved
form (and vice versa).

Languages such as Prolog II, SICStus and Oz are based on $\mathcal{RT}$, the
theory of rational trees [1,38]. This is a syntactic equality theory (i.e., a
theory where the function symbols are uninterpreted), augmented with a
uniqueness axiom for each substitution in rational solved form. Informally
speaking these axioms state that, after assigning a ground rational tree to each
non-domain variable, the substitution uniquely defines a ground rational tree
for each of its domain variables. Thus, any set of equations in rational solved
form is, by definition, satisfiable in $\mathcal{RT}$. Equality theories and, in
particular, $\mathcal{RT}$ are presented in more detail in Appendix B.1.1. Note
that being in rational solved form is a very weak property. Indeed, unification
algorithms returning a set of equations in rational solved form are allowed to
be much more "lazy" than one would usually expect. For instance,
$\{x = y, y = z\}$ and $\{x = f(y), y = f(x)\}$ are in rational solved form. We
refer the interested reader to [39,40,41] for details on the subject.
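
Since the only requirement is the absence of a circular subset, membership in rational solved form is easy to test. The following minimal sketch assumes a substitution is encoded as a dictionary from variable names (strings) to terms, with compound terms as tuples; the function name `in_rational_solved_form` is hypothetical.

```python
def is_var(t):
    """Variables are encoded as strings, compound terms as tuples."""
    return isinstance(t, str)


def in_rational_solved_form(sigma):
    """True iff `sigma` has no circular subset, i.e. no chain of
    variable-to-variable bindings leads back to a variable already seen.
    An entry mapping a variable to itself is not a binding and is rejected."""
    for start in sigma:
        seen = set()
        v = start
        while is_var(v) and v in sigma:
            if v in seen:
                return False
            seen.add(v)
            v = sigma[v]
    return True


print(in_rational_solved_form({'x': 'y', 'y': 'z'}))                # True
print(in_rational_solved_form({'x': ('f', 'y'), 'y': ('f', 'x')}))  # True
print(in_rational_solved_form({'x': 'y', 'y': 'x'}))                # False
```

The second example is accepted even though it denotes infinite trees: the cycle passes through the function symbol f, so it is not a circular subset.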

Given a set of equations $e \in \wp_f(\mathit{Eqs})$ that is satisfiable in
$\mathcal{RT}$, a substitution $\sigma \in \mathit{RSubst}$ is called a solution
for $e$ in $\mathcal{RT}$ if $\mathcal{RT} \vdash \forall(\sigma \to e)$, i.e.,
if theory $\mathcal{RT}$ entails the first order formula
$\forall(\sigma \to e)$. If in addition
$\mathrm{vars}(\sigma) \subseteq \mathrm{vars}(e)$, then $\sigma$ is said to be
a relevant solution for $e$. Finally, $\sigma$ is a most general solution for
$e$ in $\mathcal{RT}$ if $\mathcal{RT} \vdash \forall(\sigma \leftrightarrow e)$.
In this paper, the set of all the relevant most general solutions for $e$ in
$\mathcal{RT}$ will be denoted by $\mathrm{mgs}(e)$.

In the sequel, in order to model the constraint accumulation process of logic- 
based languages, we will need to characterize those sets of equations that are 
stronger than (that can be obtained by adding equations to) a given set of 
equations. 

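Definition 1 ($\downarrow(\cdot)$) The function
$\downarrow(\cdot) \colon \mathit{RSubst} \to \wp(\mathit{RSubst})$ is defined,
for each $\sigma \in \mathit{RSubst}$, by

\[
  \downarrow\sigma :=
  \{\, \tau \in \mathit{RSubst} \mid
       \exists \sigma' \in \mathit{RSubst} \mathrel{.}
       \tau \in \mathrm{mgs}(\sigma \cup \sigma') \,\}.
\]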



The next result shows that $\downarrow(\cdot)$ corresponds to the closure by
entailment in $\mathcal{RT}$.

Proposition 2 Let $\sigma \in \mathit{RSubst}$. Then

\[
  \downarrow\sigma =
  \{\, \tau \in \mathit{RSubst} \mid
       \mathcal{RT} \vdash \forall(\tau \to \sigma) \,\}.
\]



2.3 Boolean Functions 



Boolean functions have already been extensively used for data-flow analysis of 
logic-based languages. An important class of these functions used for tracking 
groundness dependencies is Pos [42]. This domain was introduced in [43] under
the name Prop and further refined and studied in [44,45]. 

The formal definition of the set of Boolean functions over a finite set of vari- 
ables is based on the notion of Boolean valuation. Note that in all the following 
definitions we abuse notation by assuming that the finite set of variables V is 
clear from context, so as to avoid using it as a suffix everywhere. 

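Definition 3 (Boolean valuation.) Let $V \in \wp_f(\mathit{Vars})$ and
$\mathit{Bool} := \{0, 1\}$. The set of Boolean valuations over $V$ is given by

\[
  \mathit{Bval} := V \to \mathit{Bool}.
\]

For each $a \in \mathit{Bval}$, each $x \in V$, and each $c \in \mathit{Bool}$
the valuation $a[c/x] \in \mathit{Bval}$ is given, for each $y \in V$, by

\[
  a[c/x](y) :=
  \begin{cases}
    c, & \text{if } x = y; \\
    a(y), & \text{otherwise.}
  \end{cases}
\]

If $X = \{x_1, \ldots, x_k\} \subseteq V$, then $a[c/X]$ denotes
$a[c/x_1] \cdots [c/x_k]$.
The distinguished elements $\mathbf{0}, \mathbf{1} \in \mathit{Bval}$ are given by

\[
  \mathbf{0} := \lambda x \in V \mathrel{.} 0, \qquad
  \mathbf{1} := \lambda x \in V \mathrel{.} 1.
\]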

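Definition 4 (Boolean function.) The set of Boolean functions over $V$ is

\[
  \mathit{Bfun} := \mathit{Bval} \to \mathit{Bool}.
\]

Bfun is partially ordered by the relation $\models$ where, for each
$\phi, \psi \in \mathit{Bfun}$,

\[
  \phi \models \psi \iff
  \bigl(\forall a \in \mathit{Bval} : \phi(a) = 1 \implies \psi(a) = 1\bigr).
\]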
For $\phi \in \mathit{Bfun}$, $x \in V$, and $c \in \mathit{Bool}$, the Boolean
function $\phi[c/x] \in \mathit{Bfun}$ is given, for each $a \in \mathit{Bval}$,
by

\[
  \phi[c/x](a) := \phi\bigl(a[c/x]\bigr).
\]

When $X \subseteq V$, $\phi[c/X]$ is defined in the expected way. If
$\phi \in \mathit{Bfun}$ and $x, y \in V$ the function
$\phi[y/x] \in \mathit{Bfun}$ is given, for each $a \in \mathit{Bval}$, by

\[
  \phi[y/x](a) := \phi\bigl(a[a(y)/x]\bigr).
\]



Boolean functions are constructed from the elementary functions corresponding
to variables and by means of the usual logical connectives. Thus, for each
$x \in V$, $x$ also denotes the Boolean function $\phi$ such that, for each
$a \in \mathit{Bval}$, $\phi(a) = 1$ if and only if $a(x) = 1$; for
$\phi \in \mathit{Bfun}$, we write $\neg\phi$ to denote the function $\psi$ such
that, for each $a \in \mathit{Bval}$, $\psi(a) = 1$ if and only if
$\phi(a) = 0$; for $\phi_1, \phi_2 \in \mathit{Bfun}$, we write
$\phi_1 \lor \phi_2$ to denote the function $\phi$ such that, for each
$a \in \mathit{Bval}$, $\phi(a) = 0$ if and only if both $\phi_1(a) = 0$ and
$\phi_2(a) = 0$. A variable is restricted away using Schröder's elimination
principle [46]:

\[
  \exists x \mathrel{.} \phi := \phi[1/x] \lor \phi[0/x].
\]
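
The constructions above translate directly into executable form. The sketch below is an illustration only: the variable set `V` is an arbitrary choice, valuations are encoded as dictionaries, Boolean functions as Python callables returning 0 or 1, and all helper names are hypothetical. Entailment is decided by enumerating every valuation, which is adequate only for very small variable sets.

```python
from itertools import product

V = ('x', 'y', 'z')          # the finite set of variables (an arbitrary choice)


def valuations():
    """Enumerate all Boolean valuations over V as dictionaries."""
    for bits in product((0, 1), repeat=len(V)):
        yield dict(zip(V, bits))


def var(x):
    """The elementary function denoted by variable x."""
    return lambda a: a[x]


def neg(phi):
    return lambda a: 1 - phi(a)


def disj(phi1, phi2):
    return lambda a: max(phi1(a), phi2(a))


def conj(phi1, phi2):
    return lambda a: min(phi1(a), phi2(a))


def subst_const(phi, c, x):
    """phi[c/x]: evaluate phi on the valuation updated to send x to c."""
    return lambda a: phi({**a, x: c})


def exists(x, phi):
    """Schroeder elimination: phi[1/x] or phi[0/x]."""
    return disj(subst_const(phi, 1, x), subst_const(phi, 0, x))


def entails(phi, psi):
    return all(psi(a) == 1 for a in valuations() if phi(a) == 1)


def equivalent(phi, psi):
    return entails(phi, psi) and entails(psi, phi)


phi = conj(var('x'), var('y'))
print(equivalent(exists('x', phi), var('y')))   # True
print(entails(phi, exists('x', phi)))           # True
```

The second check is an instance of the fact that restricting a variable away can only weaken a function.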

Note that existential quantification is both monotonic and extensive on Bfun.
The other Boolean connectives and quantifiers are handled similarly. The
distinguished elements $\bot, \top \in \mathit{Bfun}$ are the functions defined by

\[
  \bot := \lambda a \in \mathit{Bval} \mathrel{.} 0, \qquad
  \top := \lambda a \in \mathit{Bval} \mathrel{.} 1.
\]

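For notational convenience, when $X \subseteq V$, we inductively define

\[
  \bigwedge X :=
  \begin{cases}
    \top, & \text{if } X = \emptyset; \\
    x \land \bigwedge \bigl(X \setminus \{x\}\bigr), & \text{if } x \in X.
  \end{cases}
\]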



$\mathit{Pos} \subset \mathit{Bfun}$ consists precisely of those functions
assuming the true value under the everything-is-true assignment, i.e.,

\[
  \mathit{Pos} := \{\, \phi \in \mathit{Bfun} \mid \phi(\mathbf{1}) = 1 \,\}.
\]



For each $\phi \in \mathit{Bfun}$, the positive part of $\phi$, denoted
$\mathrm{pos}(\phi)$, is the strongest Pos formula that is entailed by $\phi$.
Formally,

\[
  \mathrm{pos}(\phi) := \phi \lor \bigwedge V.
\]
For each $\phi \in \mathit{Bfun}$, the set of variables necessarily true for
$\phi$ and the set of variables necessarily false for $\phi$ are given,
respectively, by

\[
  \mathrm{true}(\phi) :=
  \{\, x \in V \mid \forall a \in \mathit{Bval} :
       \phi(a) = 1 \implies a(x) = 1 \,\},
\]
\[
  \mathrm{false}(\phi) :=
  \{\, x \in V \mid \forall a \in \mathit{Bval} :
       \phi(a) = 1 \implies a(x) = 0 \,\}.
\]
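
The positive part and the sets of necessarily true and necessarily false variables can be computed by brute force on small examples. As before, this is only a sketch: the two-variable set `V`, the encoding of Boolean functions as callables over dictionaries, and the names `in_pos`, `pos`, `true_vars` and `false_vars` are assumptions made for the illustration.

```python
from itertools import product

V = ('x', 'y')               # the finite set of variables (an arbitrary choice)
ONE = {v: 1 for v in V}      # the everything-is-true valuation


def valuations():
    for bits in product((0, 1), repeat=len(V)):
        yield dict(zip(V, bits))


def in_pos(phi):
    """Membership in Pos: phi is true under the all-ones valuation."""
    return phi(ONE) == 1


def pos(phi):
    """Positive part: phi disjoined with the conjunction of all variables."""
    return lambda a: max(phi(a), min(a[v] for v in V))


def true_vars(phi):
    """Variables that are 1 in every valuation satisfying phi."""
    return {x for x in V
            if all(a[x] == 1 for a in valuations() if phi(a) == 1)}


def false_vars(phi):
    """Variables that are 0 in every valuation satisfying phi."""
    return {x for x in V
            if all(a[x] == 0 for a in valuations() if phi(a) == 1)}


# phi = x and not y: satisfied only by x = 1, y = 0, hence not in Pos.
phi = lambda a: a['x'] * (1 - a['y'])
print(in_pos(phi), in_pos(pos(phi)))              # False True
print(true_vars(phi), false_vars(phi))            # {'x'} {'y'}
print(true_vars(pos(phi)), false_vars(pos(phi)))  # {'x'} set()
```

In this example the positive part is equivalent to the single variable x: taking the positive part keeps the information that x is necessarily true but loses the information that y is necessarily false.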



3 The Concrete Domain 



A knowledge of the basic concepts of abstract interpretation theory [47,48] 
is assumed. In this paper, the concrete domain consists of pairs of the form 
$(\Sigma, V)$, where $V$ is a finite set of variables of interest [44] and
$\Sigma$ is a (possibly infinite) set of substitutions in rational solved form.



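Definition 5 (The concrete domain.) Let
$\mathcal{D}^\flat := \wp(\mathit{RSubst}) \times \wp_f(\mathit{Vars})$.
If $(\Sigma, V) \in \mathcal{D}^\flat$, then $(\Sigma, V)$ represents the
(possibly infinite) set of first-order formulas

\[
  \{\, \exists \Delta \mathrel{.} \sigma \mid
       \sigma \in \Sigma,\; \Delta = \mathrm{vars}(\sigma) \setminus V \,\}
\]

where $\sigma$ is interpreted as the logical conjunction of the equations
corresponding to its bindings.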



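The operation of projecting $x \in \mathit{Vars}$ away from
$(\Sigma, V) \in \mathcal{D}^\flat$ is defined as follows:

\[
  \exists x \mathrel{.} (\Sigma, V) :=
  \left\{\, \sigma' \in \mathit{RSubst} \;\middle|\;
    \begin{array}{l}
      \sigma \in \Sigma,\; \bar{V} = \mathit{Vars} \setminus V, \\
      \mathcal{RT} \vdash \forall\bigl(\exists \bar{V} \mathrel{.}
        (\sigma' \leftrightarrow \exists x \mathrel{.} \sigma)\bigr)
    \end{array}
  \,\right\}.
\]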



Concrete domains for constraint languages would be similar. If the analyzed 
language allows the use of constraints on various domains to restrict the values 
of the variable leaves of rational trees, the corresponding concrete domain 
would have one or more extra components to account for the constraints (see 
[49] for an example). 

The concrete element $\bigl(\bigl\{\{x \mapsto f(y)\}\bigr\}, \{x, y\}\bigr)$
expresses a dependency between $x$ and $y$. In contrast,
$\bigl(\bigl\{\{x \mapsto f(y)\}\bigr\}, \{x\}\bigr)$ only constrains $x$. The
same concept can be expressed by saying that in the first case the variable name
'y' matters, but it does not in the second case. Thus, the set of variables of
interest is crucial for defining the meaning of the concrete and abstract
descriptions. Despite this, always specifying the set of variables of interest
would significantly clutter the presentation. Moreover, most of the needed
functions on concrete and abstract descriptions preserve the set of variables of
interest. For these reasons, we assume the existence of a set
$\mathit{VI} \in \wp_f(\mathit{Vars})$ that contains, at each stage of the
analysis, the current variables of interest.^7 As a consequence, when the
context makes it clear, we will write $\Sigma \in \mathcal{D}^\flat$ as a
shorthand for $(\Sigma, \mathit{VI}) \in \mathcal{D}^\flat$.



4 An Abstract Domain for Finite-Tree Analysis

Finite-tree analysis applies to logic-based languages computing over a domain
of rational trees where cyclic structures are allowed. In contrast, analyses 
aimed at occurs-check reduction [50,51] apply to programs that are meant 
to compute on a domain of finite trees only, but have to be executed over 
systems that are either designed for rational trees or intended just for the finite 
trees but omit the occurs-check for efficiency reasons. Despite their different 
objectives, finite-tree and occurs-check analyses have much in common: in both 
cases, it is important to detect all program points where cyclic structures can 
be generated. 

Note however that, when performing occurs-check reduction, one can take 
advantage of the following invariant: all data structures generated so far are 
finite. This property is maintained by transforming the program so as to force 
finiteness whenever it is possible that a cyclic structure could have been built.^8
In contrast, a finite-tree analysis has to deal with the more general case when 
some of the data structures computed so far may be cyclic. It is therefore 
natural to consider an abstract domain made up of two components. The 
first one simply represents the set of variables that are guaranteed not to be 
bound to infinite terms. We will denote this finiteness component by H (from 
Herbrand).

Definition 6 (The finiteness component.) The finiteness component is
the set $H := \wp(\mathit{VI})$ partially ordered by reverse subset inclusion.
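
A minimal sketch of this ordering may help fix intuitions. It assumes elements of H are encoded as Python frozensets of variable names; the concrete set `VI` and the names `leq`, `join` and `meet` are hypothetical. Under reverse subset inclusion, a description is lower in the order (more precise) when it lists more variables as definitely finite, and the least upper bound of two descriptions, as needed when merging two computation paths, is their intersection.

```python
VI = frozenset({'x', 'y', 'z'})     # variables of interest (arbitrary example)

BOTTOM = VI                         # every variable is known to be finite
TOP = frozenset()                   # nothing is known


def leq(h1, h2):
    """h1 is below h2 (at least as precise) iff h1 is a superset of h2."""
    return h1 >= h2


def join(h1, h2):
    """Least upper bound: variables definitely finite in both descriptions."""
    return h1 & h2


def meet(h1, h2):
    """Greatest lower bound: variables definitely finite in either one."""
    return h1 | h2


h1 = frozenset({'x', 'y'})
h2 = frozenset({'y', 'z'})
print(sorted(join(h1, h2)))                              # ['y']
print(leq(h1, join(h1, h2)), leq(h2, join(h1, h2)))      # True True
```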

The second component of the finite-tree domain should maintain any kind of 
information that may be useful for computing finiteness information. 

^7 This parallels what happens in the efficient implementation of data-flow
analyzers. In fact, almost all the abstract domains currently in use do not need
to represent explicitly the set of variables of interest. In contrast, this set
is maintained externally and in a unique copy, typically by the fixpoint
computation engine.

^8 Such a requirement is typically obtained by replacing the unification with a
call to the standard predicate unify_with_occurs_check/2. As an alternative, in
some systems based on rational trees it is possible to insert, after each
problematic unification, a finiteness test for the generated term.

It is well-known that sharing information as a whole, therefore including
possible variable aliasing, definite linearity, and definite freeness, has a
crucial role in occurs-check reduction so that, as observed before, it can be
exploited for finite-tree analysis too. Thus, a first choice for the second
component of the finite-tree domain would be to consider one of the standard
combinations of sharing, freeness and linearity as defined, e.g., in
[28,29,52,53]. However, this would tie our specification to a particular sharing
analysis domain, whereas the overall approach is inherently more general. For
this reason, we will define a finite-tree analysis based on the abstract domain
schema H x P, where the generic sharing component P is a parameter of the
abstract domain construction. This approach can be formalized as an application
of the open product operator [30], where the interaction between the H and P
components is modeled by defining a suite of generic query operators: thus, the
overall accuracy of the finite-tree analysis will heavily depend on the accuracy
with which any specific instance of the parameter P is able to answer these
queries.

4.1 The Parameter Component P

Elements of P can encode any kind of information. We only require that
substitutions that are equivalent in the theory $\mathcal{RT}$ are identified in P.

Definition 7 (The parameter component.) The parameter component P
is an abstract domain related to the concrete domain $\wp(\mathrm{RSubst})$ by means of the
concretization function $\gamma_P\colon P \to \wp(\mathrm{RSubst})$ such that, for all $p \in P$,

\[ \bigl(\sigma \in \gamma_P(p) \land (\mathcal{RT} \vdash \forall(\sigma \leftrightarrow \tau))\bigr) \implies \tau \in \gamma_P(p). \]

The interface between H and P is provided by a set of abstract operators that 
satisfy suitable correctness criteria. We only specify those that are useful for 
defining abstract unification and projection on the combined domain H x P. 
Other operations needed for a full description of the analysis, such as renaming 
and upper bound, are very simple and, as usual, do not pose any problems. 

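Definition 8 (Abstract operators on P.) Let $s, t \in \mathrm{HTerms}$ be finite
terms. For each $p \in P$, we specify the following predicates:

s and t are independent in p if and only if
$\mathrm{ind}_p\colon \mathrm{HTerms}^2 \to \mathrm{Bool}$ holds for $(s, t)$, where

\[ \mathrm{ind}_p(s, t) \implies \forall \sigma \in \gamma_P(p) : \mathrm{vars}\bigl(\mathrm{rt}(s, \sigma)\bigr) \cap \mathrm{vars}\bigl(\mathrm{rt}(t, \sigma)\bigr) = \emptyset; \]

s and t share linearly in p if and only if
$\mathrm{share\_lin}_p\colon \mathrm{HTerms}^2 \to \mathrm{Bool}$ holds for $(s, t)$, where

\[ \mathrm{share\_lin}_p(s, t) \implies \forall \sigma \in \gamma_P(p) : \forall y \in \mathrm{vars}\bigl(\mathrm{rt}(s, \sigma)\bigr) \cap \mathrm{vars}\bigl(\mathrm{rt}(t, \sigma)\bigr) : \mathrm{occ\_lin}\bigl(y, \mathrm{rt}(s, \sigma)\bigr) \land \mathrm{occ\_lin}\bigl(y, \mathrm{rt}(t, \sigma)\bigr); \]

t is ground in p if and only if
$\mathrm{ground}_p\colon \mathrm{HTerms} \to \mathrm{Bool}$ holds for t, where

\[ \mathrm{ground}_p(t) \implies \forall \sigma \in \gamma_P(p) : \mathrm{rt}(t, \sigma) \in \mathrm{GTerms}; \]

t is ground-or-free in p if and only if
$\mathrm{gfree}_p\colon \mathrm{HTerms} \to \mathrm{Bool}$ holds for t, where

\[ \mathrm{gfree}_p(t) \implies \forall \sigma \in \gamma_P(p) : \mathrm{rt}(t, \sigma) \in \mathrm{GTerms} \lor \mathrm{rt}(t, \sigma) \in \mathrm{Vars}; \]

s is linear in p if and only if
$\mathrm{lin}_p\colon \mathrm{HTerms} \to \mathrm{Bool}$ holds for s, where

\[ \mathrm{lin}_p(s) \implies \forall \sigma \in \gamma_P(p) : \mathrm{rt}(s, \sigma) \in \mathrm{LTerms}; \]

s and t are or-linear in p if and only if
$\mathrm{or\_lin}_p\colon \mathrm{HTerms}^2 \to \mathrm{Bool}$ holds for $(s, t)$, where

\[ \mathrm{or\_lin}_p(s, t) \implies \forall \sigma \in \gamma_P(p) : \mathrm{rt}(s, \sigma) \in \mathrm{LTerms} \lor \mathrm{rt}(t, \sigma) \in \mathrm{LTerms}. \]

For each $p \in P$, the following functions compute subsets of the set of variables
of interest:

the function $\mathrm{share\_same\_var}_p\colon \mathrm{HTerms} \times \mathrm{HTerms} \to \wp(\mathit{VI})$ returns a set of
variables that may share with the given terms via the same variable. For each
pair of terms $s, t \in \mathrm{HTerms}$,

\[ \mathrm{share\_same\_var}_p(s, t) \supseteq \Bigl\{\, y \in \mathit{VI} \Bigm| \exists \sigma \in \gamma_P(p) \mathrel{.} \exists z \in \mathrm{vars}\bigl(\mathrm{rt}(y, \sigma)\bigr) \mathrel{.} z \in \mathrm{vars}\bigl(\mathrm{rt}(s, \sigma)\bigr) \cap \mathrm{vars}\bigl(\mathrm{rt}(t, \sigma)\bigr) \,\Bigr\}; \]

the function $\mathrm{share\_with}_p\colon \mathrm{HTerms} \to \wp(\mathit{VI})$ yields a set of variables that may
share with the given term. For each $t \in \mathrm{HTerms}$,

\[ \mathrm{share\_with}_p(t) \stackrel{\mathrm{def}}{=} \bigl\{\, y \in \mathit{VI} \bigm| y \in \mathrm{share\_same\_var}_p(y, t) \,\bigr\}. \]

The function $\mathrm{amgu}_P\colon P \times \mathrm{Bind} \to P$ correctly captures the effects of a binding
on an element of P. For each $(x \mapsto t) \in \mathrm{Bind}$ and $p \in P$, let

\[ p' \stackrel{\mathrm{def}}{=} \mathrm{amgu}_P(p, x \mapsto t); \]

for all $\sigma \in \gamma_P(p)$, if $\tau \in \mathrm{mgs}\bigl(\sigma \cup \{x = t\}\bigr)$, then $\tau \in \gamma_P(p')$.

The function $\mathrm{proj}_P\colon P \times \mathit{VI} \to P$ correctly captures the operation of projecting
away a variable from an element of P. For each $x \in \mathit{VI}$, $p \in P$ and $\sigma \in \gamma_P(p)$,
if $\tau \in \exists x \mathrel{.} \{\sigma\}$, then $\tau \in \gamma_P\bigl(\mathrm{proj}_P(p, x)\bigr)$.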

As will be shown in Appendix A, some of these generic operators can be
directly mapped to the corresponding abstract operators defined for well-known
sharing analysis domains. However, the specification given in Definition 8,
besides being more general than a particular implementation, also allows for a
modular approach when proving correctness results.

4.2 Operators on Substitutions in Rational Solved Form

There are cases when an analysis tries to capture properties of the particular 
substitutions computed by a specific (ordinary or rational) unification algo- 
rithm. This is the case, for example, when the analysis needs to track structure 
sharing for the purpose of compile-time garbage collection, or provide upper 
bounds on the amount of memory needed to perform a given computation. 
More often the interest is on properties of the (finite or rational) trees that 
are denoted by such substitutions. 

When the concrete domain is based on the theory of finite trees, idempotent 
substitutions provide a finitely computable strong normal form for domain 
elements, meaning that different substitutions describe different sets of finite 
trees (as usual, this is modulo the possible renaming of variables). In contrast, 
when working on a concrete domain based on the theory of rational trees, sub- 
stitutions in rational solved form, while being finitely computable, no longer 
satisfy this property: there can be an infinite set of substitutions in rational 
solved form all describing the same set of rational trees (i.e., the same element 
in the "intended" semantics). For instance, the substitutions 

\[ \sigma_n = \bigl\{\, x \mapsto \underbrace{f(\cdots f(}_{n} x) \cdots) \,\bigr\}, \]

for n = 1, 2, ..., all map the variable x to the same rational tree (which is
usually denoted by $f^\omega$).

Ideally, a strong normal form for the set of rational trees described by a
substitution $\sigma \in \mathrm{RSubst}$ can be obtained by computing the limit function

\[ \sigma^\infty = \lambda t \in \mathrm{HTerms} \mathrel{.} \mathrm{rt}(t, \sigma), \]

obtained by fixing the substitution parameter of 'rt'. The problem is that, in
general, $\sigma^\infty$ is not a substitution: while having a finite domain, its "bindings"
$x \mapsto \lim_{i \to \infty} \sigma^i(x)$ can map a domain variable x to an infinite rational term.
This poses a non-trivial problem when trying to define a "good" abstraction
function, since it would be really desirable for this function to map any two
equivalent concrete elements to the same abstract element. Of course, it is
important that the properties under investigation are exactly captured, so as
to avoid any unnecessary precision loss. Pursuing this goal requires an ability
to observe properties of (infinite) rational trees while just dealing with one of
their finite representations. This is not always an easy task since even simple
properties can be "hidden" when using non-idempotent substitutions. For
instance, when $\sigma^\infty$ maps variable x to an infinite and ground rational tree (i.e.,
when $\mathrm{rt}(x, \sigma) \in \mathrm{GTerms} \setminus \mathrm{HTerms}$), all of its finite representations in RSubst
(i.e., all the $\tau \in \mathrm{RSubst}$ such that $\mathcal{RT} \vdash \forall(\sigma \leftrightarrow \tau)$) will map the variable x
into a finite term that is not ground. These are the motivations behind the
introduction of the following computable operators on substitutions.

The groundness operator 'gvars' captures the set of variables that are mapped 
to ground rational trees by rt. We define it by means of the occurrence opera- 
tor 'occ'. This was introduced in [54] as a replacement for the sharing-group 
operator 'sg' of [55]. In [54] the 'occ' operator is used to define a new ab- 
straction function for set-sharing analysis that, differently from the classical 
ones [56,55], maps equivalent substitutions in rational solved form to the same 
abstract element. 

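Definition 9 (Occurrence and groundness operators.) For each $n \in \mathbb{N}$,
the occurrence function $\mathrm{occ}_n\colon \mathrm{RSubst} \times \mathrm{Vars} \to \wp_f(\mathrm{Vars})$ is defined, for each
$\sigma \in \mathrm{RSubst}$ and each $v \in \mathrm{Vars}$, by

[the defining equations of $\mathrm{occ}_n$ are missing from the source]

The occurrence operator $\mathrm{occ}\colon \mathrm{RSubst} \times \mathrm{Vars} \to \wp_f(\mathrm{Vars})$ is given, for each
$\sigma \in \mathrm{RSubst}$ and $v \in \mathrm{Vars}$, by $\mathrm{occ}(\sigma, v) \stackrel{\mathrm{def}}{=} \mathrm{occ}_\ell(\sigma, v)$, where $\ell = \#\sigma$.

The groundness operator $\mathrm{gvars}\colon \mathrm{RSubst} \to \wp_f(\mathrm{Vars})$ is given, for each
substitution $\sigma \in \mathrm{RSubst}$, by

\[ \mathrm{gvars}(\sigma) \stackrel{\mathrm{def}}{=} \bigl\{\, y \in \mathrm{dom}(\sigma) \bigm| \forall v \in \mathrm{vars}(\sigma) : y \notin \mathrm{occ}(\sigma, v) \,\bigr\}. \]

Example 10 Let

\[ \sigma = \bigl\{ x \mapsto f(y, z),\; y \mapsto g(z, x),\; z \mapsto f(a) \bigr\}. \]

Then $\mathrm{gvars}(\sigma) = \{x, y, z\}$, although $\mathrm{vars}(x\sigma^i) \neq \emptyset$ and $\mathrm{vars}(y\sigma^i) \neq \emptyset$, for all
$0 \le i < \infty$.

The finiteness operator is defined, like 'occ', by means of a fixpoint
construction.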



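Definition 11 (Finiteness functions.) For each $n \in \mathbb{N}$, the finiteness
function $\mathrm{hvars}_n\colon \mathrm{RSubst} \to \wp(\mathrm{Vars})$ is defined, for each $\sigma \in \mathrm{RSubst}$, by

\[ \mathrm{hvars}_0(\sigma) \stackrel{\mathrm{def}}{=} \mathrm{Vars} \setminus \mathrm{dom}(\sigma) \]

and, for $n > 0$, by

\[ \mathrm{hvars}_n(\sigma) \stackrel{\mathrm{def}}{=} \mathrm{hvars}_{n-1}(\sigma) \cup \bigl\{\, y \in \mathrm{dom}(\sigma) \bigm| \mathrm{vars}(y\sigma) \subseteq \mathrm{hvars}_{n-1}(\sigma) \,\bigr\}. \]

For each $\sigma \in \mathrm{RSubst}$ and each $i \ge 0$, we have $\mathrm{hvars}_i(\sigma) \subseteq \mathrm{hvars}_{i+1}(\sigma)$ and
also that $\mathrm{Vars} \setminus \mathrm{hvars}_i(\sigma) \subseteq \mathrm{dom}(\sigma)$ is a finite set. By these two properties,
the chain $\mathrm{hvars}_0(\sigma) \subseteq \mathrm{hvars}_1(\sigma) \subseteq \cdots$ is stationary and finitely computable.
In particular, if $\ell = \#\sigma$, then, for all $n \ge \ell$, $\mathrm{hvars}_\ell(\sigma) = \mathrm{hvars}_n(\sigma)$.

Definition 12 (Finiteness operator.) For each $\sigma \in \mathrm{RSubst}$, the finiteness
operator $\mathrm{hvars}\colon \mathrm{RSubst} \to \wp(\mathrm{Vars})$ is given by $\mathrm{hvars}(\sigma) \stackrel{\mathrm{def}}{=} \mathrm{hvars}_\ell(\sigma)$ where
$\ell \stackrel{\mathrm{def}}{=} \ell(\sigma) \in \mathbb{N}$ is such that $\mathrm{hvars}_\ell(\sigma) = \mathrm{hvars}_n(\sigma)$ for all $n \ge \ell$.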

The following proposition shows that the 'hvars' operator precisely captures 
the intended property. 

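Proposition 13 If $\sigma \in \mathrm{RSubst}$ and $x \in \mathrm{Vars}$ then

\[ x \in \mathrm{hvars}(\sigma) \iff \mathrm{rt}(x, \sigma) \in \mathrm{HTerms}. \]

Example 14 Consider $\sigma \in \mathrm{RSubst}$, where

\[ \sigma = \bigl\{ x_1 \mapsto f(x_2),\; x_2 \mapsto g(x_5),\; x_3 \mapsto f(x_4),\; x_4 \mapsto g(x_3) \bigr\}. \]

Then,

\begin{align*}
\mathrm{hvars}_0(\sigma) &= \mathrm{Vars} \setminus \{x_1, x_2, x_3, x_4\}, \\
\mathrm{hvars}_1(\sigma) &= \mathrm{Vars} \setminus \{x_1, x_3, x_4\}, \\
\mathrm{hvars}_2(\sigma) &= \mathrm{Vars} \setminus \{x_3, x_4\} \\
                         &= \mathrm{hvars}(\sigma).
\end{align*}

Thus, $x_1 \in \mathrm{hvars}(\sigma)$, although $\mathrm{vars}(x_1\sigma) \subseteq \mathrm{dom}(\sigma)$.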
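
The fixpoint construction of Definitions 11 and 12 can be sketched in a few
lines of Python. This is only an illustration under stated assumptions: the
term encoding (strings for variables, tuples for compound terms) and the names
`term_vars`, `hvars_chain` and `finite_domain_vars` are ours, not part of the
formal development. Since every variable outside dom(σ) belongs to each
hvars_n(σ), the sketch stores only the domain part of each set in the chain.

```python
from typing import Dict, FrozenSet, List, Set, Tuple, Union

# Illustrative term encoding (an assumption of this sketch): a variable is a
# string; a compound term or constant is a tuple (functor, arg_1, ..., arg_n).
Term = Union[str, Tuple]
Subst = Dict[str, Term]  # the bindings of a substitution in rational solved form


def term_vars(t: Term) -> Set[str]:
    """Return vars(t), the set of variables occurring in the finite term t."""
    if isinstance(t, str):
        return {t}
    found: Set[str] = set()
    for arg in t[1:]:
        found |= term_vars(arg)
    return found


def hvars_chain(sigma: Subst) -> List[FrozenSet[str]]:
    """Return [dom(sigma) & hvars_n(sigma) for n = 0, 1, ...] up to the fixpoint.

    Variables outside dom(sigma) are in every hvars_n, so they are left implicit.
    """
    current: FrozenSet[str] = frozenset()  # hvars_0 holds no domain variable
    chain = [current]
    while True:
        # y enters when every variable of its binding is already known finite.
        step = frozenset(
            y for y, t in sigma.items()
            if all(v not in sigma or v in current for v in term_vars(t))
        )
        nxt = current | step
        if nxt == current:
            return chain
        current = nxt
        chain.append(current)


def finite_domain_vars(sigma: Subst) -> FrozenSet[str]:
    """Return dom(sigma) & hvars(sigma): the bound variables denoting finite trees."""
    return hvars_chain(sigma)[-1]


# The substitution of Example 14.
sigma = {
    "x1": ("f", "x2"),
    "x2": ("g", "x5"),
    "x3": ("f", "x4"),
    "x4": ("g", "x3"),
}
print(hvars_chain(sigma))
```

On the substitution of Example 14 the chain adds x2 first and x1 next, while
x3 and x4 never enter it, in agreement with the sets listed in the example.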

The following proposition states how 'gvars' and 'hvars' behave with respect 
to the further instantiation of variables. 

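Proposition 15 Let $\sigma, \tau \in \mathrm{RSubst}$, where $\tau \in \mathop{\downarrow}\sigma$. Then

\begin{align*}
\mathrm{hvars}(\sigma) &\supseteq \mathrm{hvars}(\tau), \tag{15a} \\
\mathrm{gvars}(\sigma) \cap \mathrm{hvars}(\sigma) &\subseteq \mathrm{gvars}(\tau) \cap \mathrm{hvars}(\tau). \tag{15b}
\end{align*}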



4.3 The Abstraction Function for H 

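A Galois connection between the concrete domain $\wp(\mathrm{RSubst})$ and the
finiteness component $H = \wp(\mathit{VI})$ can now be defined naturally.

Definition 16 (The Galois connection between $\wp(\mathrm{RSubst})$ and H.) The
abstraction function $\alpha_H\colon \mathrm{RSubst} \to H$ is defined, for each $\sigma \in \mathrm{RSubst}$, by

\[ \alpha_H(\sigma) \stackrel{\mathrm{def}}{=} \mathit{VI} \cap \mathrm{hvars}(\sigma). \]

The concrete domain $\wp(\mathrm{RSubst})$ is related to H by means of the abstraction function
$\alpha_H\colon \wp(\mathrm{RSubst}) \to H$ such that, for each $\Sigma \in \wp(\mathrm{RSubst})$,

\[ \alpha_H(\Sigma) \stackrel{\mathrm{def}}{=} \bigcap \bigl\{\, \alpha_H(\sigma) \bigm| \sigma \in \Sigma \,\bigr\}. \]

Since the abstraction function $\alpha_H$ is additive, the concretization function is
given by its adjoint [47]; whenever $h \in H$,

\[ \gamma_H(h) \stackrel{\mathrm{def}}{=} \bigl\{\, \sigma \in \mathrm{RSubst} \bigm| \alpha_H(\sigma) \supseteq h \,\bigr\}. \]

With these definitions, we have the desired result: equivalent substitutions in
rational solved form have the same finiteness abstraction.

Theorem 17 If $\sigma, \tau \in \mathrm{RSubst}$ and $\mathcal{RT} \vdash \forall(\sigma \leftrightarrow \tau)$, then $\alpha_H(\sigma) = \alpha_H(\tau)$.



4.4 Abstract Unification and Projection on H x P

The abstract unification for the combined domain $H \times P$ is defined by using the
abstract predicates and functions as specified for P as well as a new finiteness
predicate for the domain H.

Definition 18 (Abstract unification on $H \times P$.) A term $t \in \mathrm{HTerms}$ is a
finite tree in $h \in H$ if and only if the predicate $\mathrm{hterm}_h\colon \mathrm{HTerms} \to \mathrm{Bool}$ holds
for t, where

\[ \mathrm{hterm}_h(t) \stackrel{\mathrm{def}}{=} \bigl(\mathrm{vars}(t) \subseteq h\bigr). \]

The function $\mathrm{amgu}_H\colon (H \times P) \times \mathrm{Bind} \to H$ captures the effects of a binding
on an H element. Let $\langle h, p \rangle \in H \times P$ and $(x \mapsto t) \in \mathrm{Bind}$. Then

\[ \mathrm{amgu}_H\bigl(\langle h, p \rangle, x \mapsto t\bigr) \stackrel{\mathrm{def}}{=} h', \]

where h' is given by the first case that applies in

\[
h' \stackrel{\mathrm{def}}{=}
\begin{cases}
h \cup \mathrm{vars}(t), & \text{if } \mathrm{hterm}_h(x) \land \mathrm{ground}_p(x); \\
h \cup \{x\}, & \text{if } \mathrm{hterm}_h(t) \land \mathrm{ground}_p(t); \\
h, & \text{if } \mathrm{hterm}_h(x) \land \mathrm{hterm}_h(t) \land \mathrm{ind}_p(x, t) \land \mathrm{or\_lin}_p(x, t); \\
h, & \text{if } \mathrm{hterm}_h(x) \land \mathrm{hterm}_h(t) \land \mathrm{gfree}_p(x) \land \mathrm{gfree}_p(t); \\
h \setminus \mathrm{share\_same\_var}_p(x, t), & \text{if } \mathrm{hterm}_h(x) \land \mathrm{hterm}_h(t) \land \mathrm{share\_lin}_p(x, t) \land \mathrm{or\_lin}_p(x, t); \\
h \setminus \mathrm{share\_with}_p(x), & \text{if } \mathrm{hterm}_h(x) \land \mathrm{lin}_p(x); \\
h \setminus \mathrm{share\_with}_p(t), & \text{if } \mathrm{hterm}_h(t) \land \mathrm{lin}_p(t); \\
h \setminus \bigl(\mathrm{share\_with}_p(x) \cup \mathrm{share\_with}_p(t)\bigr), & \text{otherwise.}
\end{cases}
\]

The abstract unification function $\mathrm{amgu}\colon (H \times P) \times \mathrm{Bind} \to H \times P$, for any
$\langle h, p \rangle \in H \times P$ and $(x \mapsto t) \in \mathrm{Bind}$, is given by

\[ \mathrm{amgu}\bigl(\langle h, p \rangle, x \mapsto t\bigr) \stackrel{\mathrm{def}}{=} \Bigl\langle \mathrm{amgu}_H\bigl(\langle h, p \rangle, x \mapsto t\bigr),\; \mathrm{amgu}_P(p, x \mapsto t) \Bigr\rangle. \]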



In the computation of h' (the new finiteness component resulting from the 
abstract evaluation of a binding) there are eight cases based on properties 
holding for the concrete terms described by x and t. 

(1) In the first case, the concrete term described by x is both finite and 
ground. Thus, after a successful execution of the binding, any concrete 
term described by t will be finite. Note that t could have contained vari- 
ables which may be possibly bound to cyclic terms just before the exe- 
cution of the binding. 

(2) The second case is symmetric to the first one. Note that these are the only 
cases when a "positive" propagation of finiteness information is correct. In 
contrast, in all the remaining cases, the goal is to limit as much as possible 
the propagation of "negative" information, i.e., the possible cyclicity of 
terms. 

(3) The third case exploits the classical results proved in research work on
occurs-check reduction [50,51]. Accordingly, it is required that both x
and t describe finite terms that do not share. The use of the implicitly
disjunctive predicate $\mathrm{or\_lin}_p$ allows for the application of this case even
when neither x nor t is known to be definitely linear. For instance, as
observed in [50], this may happen when the component P embeds the
domain Pos for groundness analysis.

(4) The fourth case exploits the observation that cyclic terms cannot be 
created when unifying two finite terms that are either ground or free. 
Ground-or-freeness [28,29] is a safe, more precise and inexpensive replace- 
ment for the classical freeness property when combining sharing analysis 
domains. 

(5) The fifth case applies when unifying a linear and finite term with another 
finite term possibly sharing with it, provided they can only share linearly 
(namely, all the shared variables occur linearly in the considered terms). 
In such a context, only the shared variables can introduce cycles. 

(6) In the sixth case, we drop the assumption about the finiteness of the term
described by t. As a consequence, all variables sharing with x become
possibly cyclic. However, provided x describes a finite and linear term,
all finite variables independent from x preserve their finiteness.

(7) The seventh case is symmetric to the sixth one. 

(8) The last case states that term finiteness is preserved for all variables that 
are independent from both x and t. 

The following result, together with the assumption on amgUp as specified in 
Definition 8, ensures that abstract unification on the combined domain H x P 
is correct. 

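Theorem 19 Let $\langle h, p \rangle \in H \times P$ and $(x \mapsto t) \in \mathrm{Bind}$, where $\{x\} \cup \mathrm{vars}(t) \subseteq \mathit{VI}$.
Let also $\sigma \in \gamma_H(h) \cap \gamma_P(p)$ and $h' = \mathrm{amgu}_H\bigl(\langle h, p \rangle, x \mapsto t\bigr)$. Then

\[ \tau \in \mathrm{mgs}\bigl(\sigma \cup \{x = t\}\bigr) \implies \tau \in \gamma_H(h'). \]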
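
The case analysis of Definition 18 translates almost literally into code once
the queries of Definition 8 are available as methods. The following Python
sketch is illustrative only: the class `GroundOnly` is a hypothetical and
deliberately coarse stand-in for the parameter P (it records nothing but a set
of definitely ground variables and answers every other query conservatively),
and the function and method names are ours.

```python
from dataclasses import dataclass
from typing import FrozenSet, Set, Tuple, Union

Term = Union[str, Tuple]  # a variable name, or (functor, arg_1, ..., arg_n)


def term_vars(t: Term) -> Set[str]:
    """Return the set of variables occurring in the finite term t."""
    if isinstance(t, str):
        return {t}
    found: Set[str] = set()
    for arg in t[1:]:
        found |= term_vars(arg)
    return found


@dataclass(frozen=True)
class GroundOnly:
    """Hypothetical toy instance of P: only definite groundness is known."""
    vi: FrozenSet[str]
    ground_vars: FrozenSet[str]

    def ground(self, t: Term) -> bool:
        return term_vars(t) <= self.ground_vars

    def either_ground(self, s: Term, t: Term) -> bool:
        return self.ground(s) or self.ground(t)

    # A ground term shares with nothing and is trivially linear, so these
    # conservative answers are sound (though far from precise).
    ind = or_lin = share_lin = either_ground
    gfree = lin = ground

    def share_same_var(self, s: Term, t: Term) -> FrozenSet[str]:
        if self.either_ground(s, t):
            return frozenset()
        return self.vi - self.ground_vars  # any non-ground variable may share

    def share_with(self, t: Term) -> FrozenSet[str]:
        return frozenset(y for y in self.vi if y in self.share_same_var(y, t))


def amgu_h(h: FrozenSet[str], p, x: str, t: Term) -> FrozenSet[str]:
    """Return h' for the binding x -> t, taking the first applicable case."""
    hx = x in h               # hterm_h(x)
    ht = term_vars(t) <= h    # hterm_h(t)
    if hx and p.ground(x):
        return h | term_vars(t)
    if ht and p.ground(t):
        return h | {x}
    if hx and ht and p.ind(x, t) and p.or_lin(x, t):
        return h
    if hx and ht and p.gfree(x) and p.gfree(t):
        return h
    if hx and ht and p.share_lin(x, t) and p.or_lin(x, t):
        return h - p.share_same_var(x, t)
    if hx and p.lin(x):
        return h - p.share_with(x)
    if ht and p.lin(t):
        return h - p.share_with(t)
    return h - (p.share_with(x) | p.share_with(t))


vi = frozenset({"x", "y", "z", "w"})
# x is finite and ground: the variables of t become finite (first case).
print(amgu_h(frozenset({"x", "w"}), GroundOnly(vi, frozenset({"x"})), "x", ("f", "y", "z")))
```

With such a coarse parameter most bindings fall through to the last case and
lose finiteness information, which illustrates why the accuracy of the
analysis depends on how precisely the chosen instance of P answers the queries.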

Abstract projection on the composite domain H x P is much simpler than 
abstract unification, because in this case there is no interaction between the 
two components of the abstract domain. 

Footnote to case (3) above: Let t be y. Let also P be Pos. Then, given the Pos
formula $\phi \stackrel{\mathrm{def}}{=} (x \lor y)$, both $\mathrm{ind}_\phi(x, y)$ and $\mathrm{or\_lin}_\phi(x, y)$ satisfy the
conditions in Definition 8. Note that from $\phi$ we cannot infer that x is
definitely linear, nor that y is definitely linear.

Definition 20 (Abstract projection on $H \times P$.) The function
$\mathrm{proj}_H\colon H \times \mathit{VI} \to H$ captures the effects, on the H component, of projecting
away a variable. For each $h \in H$ and $x \in \mathit{VI}$,

\[ \mathrm{proj}_H(h, x) \stackrel{\mathrm{def}}{=} h \cup \{x\}. \]

The abstract variable projection function $\mathrm{proj}\colon (H \times P) \times \mathit{VI} \to H \times P$, for
any $\langle h, p \rangle \in H \times P$ and $x \in \mathit{VI}$, is given by

\[ \mathrm{proj}\bigl(\langle h, p \rangle, x\bigr) \stackrel{\mathrm{def}}{=} \bigl\langle \mathrm{proj}_H(h, x),\; \mathrm{proj}_P(p, x) \bigr\rangle. \]



As a consequence, as far as the H component is concerned, the correctness 
of the projection function does not depend on the assumption on projp as 
specified in Definition 8. 

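Theorem 21 Let $x \in \mathit{VI}$, $h \in H$ and $\sigma \in \gamma_H(h)$. Then

\[ \tau \in \exists x \mathrel{.} \{\sigma\} \implies \tau \in \gamma_H\bigl(\mathrm{proj}_H(h, x)\bigr). \]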



We do not consider the disjunction and conjunction operations here. The im- 
plementation (and therefore proof of correctness) for disjunction is straight- 
forward and omitted. The implementation of independent conjunction where 
the descriptions are renamed apart is also straightforward. On the other hand, 
full conjunction, which is only needed for a top-down analysis framework, can 
be approximated by combining unification and independent conjunction, ob- 
taining a correct (although possibly less precise) analysis. 

Several abstract domains for sharing analysis can be used to implement the
parameter component P. As a basic implementation, one could consider the
well-known set-sharing domain of Jacobs and Langen [55]. In such a case,
most of the required correctness results have already been established in [54].
Note however that, since no freeness and linearity information is recorded in
the plain set-sharing domain, some of the predicates of Definition 8 need to
be grossly approximated. For instance, the predicate $\mathrm{gfree}_p$ will provide useful
information only when applied to an argument that is known to be definitely
ground. Another possibility would be to use the domain based on pair-sharing,
definite groundness and definite linearity described in [37]. A more precise
choice is constituted by the SFL domain (an acronym standing for Set-sharing
plus Freeness plus Linearity) introduced in [57,33]. Even in this case,
all the non-trivial correctness results have already been proved. In particular,
in [32,33] it is shown that the abstraction function satisfies the requirement of
Definition 7 and that the abstract unification operator is correct with respect
to rational-tree unification. In order to better highlight the generality of our
specification of the sharing component P, the instantiation of P to SFL is
presented in Appendix A. Notice that the quest for more precision does not
end with SFL: a number of possible precision improvements are presented and
discussed in [28,29].



5 Finite-Tree Dependencies



The precision of the finite-tree analysis based on $H \times P$ is highly dependent on
the precision of the generic component P. As explained before, the information
provided by P on groundness, freeness, linearity, and sharing of variables is
exploited, in the combination $H \times P$, to circumscribe as much as possible
the creation and propagation of cyclic terms. However, finite-tree analysis can
also benefit from other kinds of relational information. In particular, we now
show how finite-tree dependencies allow a positive propagation of finiteness
information.

Let us consider the finite terms $t_1 = f(x)$, $t_2 = g(y)$, and $t_3 = h(x, y)$: it
is clear that, for each assignment of rational terms to x and y, $t_3$ is finite
if and only if $t_1$ and $t_2$ are so. We can capture this by the Boolean formula
$t_3 \leftrightarrow (t_1 \land t_2)$. The reasoning is based on the following facts:

(1) $t_1$, $t_2$, and $t_3$ are finite terms, so that the finiteness of their instances
depends only on the finiteness of the terms that take the place of x and y.

(2) $\mathrm{vars}(t_3) \supseteq \mathrm{vars}(t_1) \cup \mathrm{vars}(t_2)$, that is, $t_3$ covers both $t_1$ and $t_2$; this means
that, if an assignment to the variables of $t_3$ produces a finite instance
of $t_3$, that very assignment will necessarily result in finite instances of $t_1$
and $t_2$. Conversely, an assignment producing non-finite instances of $t_1$ or
$t_2$ will forcibly result in a non-finite instance of $t_3$.

(3) Similarly, $t_1$ and $t_2$, taken together, cover $t_3$.

Footnote: The introduction of such Boolean formulas, called dependency
formulas, is originally due to P. W. Dart [58].

The important point to notice is that this dependency will keep holding for
any further simultaneous instantiation of $t_1$, $t_2$, and $t_3$. In other words, such
dependencies are preserved by forward computations (which proceed by
consistently instantiating program variables).

Consider $x \mapsto t \in \mathrm{Bind}$ where $t \in \mathrm{HTerms}$ and $\mathrm{vars}(t) = \{y_1, \ldots, y_n\}$. After
this binding has been successfully applied, the destinies of x and t concerning
term-finiteness are tied together forever. This tie can be described by the
dependency formula

\[ x \leftrightarrow (y_1 \land \cdots \land y_n), \tag{2} \]

meaning that x will be bound to a finite term if and only if $y_i$ is bound to a
finite term, for each i = 1, ..., n. While the dependency expressed by (2) is a
correct description of any computation state following the application of the
binding $x \mapsto t$, it is not as precise as it could be. Suppose that x and $y_k$ are
indeed the same variable. Then (2) is logically equivalent to

\[ x \rightarrow (y_1 \land \cdots \land y_{k-1} \land y_{k+1} \land \cdots \land y_n). \tag{3} \]

Although this is correct (whenever x is bound to a finite term, all the other
variables will be bound to finite terms), it misses the point that x has just
been bound, irrevocably, to a non-finite term: no forward computation can
change this. Thus, the implication (3) holds vacuously. A more precise and
correct description for the state of affairs caused by the cyclic binding is,
instead, the negated atom $\neg x$, whose intuitive reading is "x is not (and never
will be) finite."
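
The logical equivalence between (2) and (3) in the cyclic case can be checked
in two steps. In the derivation below, $r$ is a shorthand introduced here (not
used elsewhere in the text) for the conjunction of all the $y_i$ with $i \neq k$,
so that, when $x$ and $y_k$ coincide, the right-hand side of (2) is $x \land r$.

```latex
\begin{align*}
x \leftrightarrow (x \land r)
  &\equiv \bigl(x \rightarrow (x \land r)\bigr) \land \bigl((x \land r) \rightarrow x\bigr) \\
  &\equiv (x \rightarrow r) \land \top \\
  &\equiv x \rightarrow r,
  \qquad\text{where } r = y_1 \land \cdots \land y_{k-1} \land y_{k+1} \land \cdots \land y_n.
\end{align*}
```

The second conjunct is a tautology, and in the first one the consequent
$x \land r$ can be weakened to $r$ because $x$ already appears as the antecedent.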

We are building an abstract domain for finite-tree dependencies where we are
making the deliberate choice of including only information that cannot be
withdrawn by forward computations. The reason for this choice is that we
want the concrete constraint accumulation process to be paralleled, at the
abstract level, by another constraint accumulation process: logical conjunction
of Boolean formulas. For this reason, it is important to distinguish between
permanent and contingent information. Permanent information, once
established for a program point p, maintains its validity in all points that follow p
in any forward computation. Contingent information, instead, does not carry
its validity beyond the point where it is established. An example of
contingent information is given by the h component of $H \times P$: having $x \in h$ in
the description of some program point means that x is definitely bound to a
finite term at that point; nothing is claimed about the finiteness of x at later
program points and, in fact, unless x is ground, x can still be bound to a
non-finite term. However, if at some program point x is finite and ground,
then x will remain finite. In this case we will ensure our Boolean dependency
formula entails the positive atom x.

At this stage, we already know something about the abstract domain we are 
designing. In particular, we have positive and negated atoms, the requirement 
of describing program predicates of any arity implies that arbitrary conjunc- 
tions of these atomic formulas must be allowed and, finally, it is not difficult to 
observe that the merge-over-all-paths operation [47] will be logical disjunction, 
so that the domain will have to be closed under this operation. This means 
that the carrier of our domain must be able to express any Boolean function 
over the finite set VI of the variables of interest: Bfun is the carrier. 

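Definition 22 ($\gamma_F\colon \mathrm{Bfun} \to \wp(\mathrm{RSubst})$.) The function $\mathrm{hval}\colon \mathrm{RSubst} \to \mathrm{Bval}$
is defined, for each $\sigma \in \mathrm{RSubst}$ and each $x \in \mathit{VI}$, by

\[ \mathrm{hval}(\sigma)(x) = 1 \iff x \in \mathrm{hvars}(\sigma). \]

The concretization function $\gamma_F\colon \mathrm{Bfun} \to \wp(\mathrm{RSubst})$ is defined, for $\phi \in \mathrm{Bfun}$,
by

\[ \gamma_F(\phi) \stackrel{\mathrm{def}}{=} \bigl\{\, \sigma \in \mathrm{RSubst} \bigm| \forall \tau \in \mathop{\downarrow}\sigma : \phi\bigl(\mathrm{hval}(\tau)\bigr) = 1 \,\bigr\}. \]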



The domain of positive Boolean functions Pos used, among other things, for
groundness analysis is so popular that our use of the domain Bfun deserves
some further comments. For the representation of finite-tree dependencies, the
presence in the domain of negative functions such as $\neg x$, meaning that x is
bound to an infinite term, is an important feature. One reason why it is so
is that knowing about definite non-finiteness can improve the information on
definite finiteness. The easiest example goes as follows: if we know that either
x or y is finite (i.e., $x \lor y$) and we know that x is not finite (i.e., $\neg x$), then
we can deduce that y must be finite (i.e., y). It is important to observe that
this reasoning can be applied, verbatim, to groundness: a knowledge of
non-groundness may improve groundness information. The big difference is that
non-finiteness is information of the permanent kind while non-groundness is
only contingent. As a consequence, a knowledge of finiteness and non-finiteness
can be monotonically accumulated along computation paths by computing the
logical conjunction of Boolean formulae. An approach where groundness and
non-groundness information is represented by elements of Bfun would need
to use a much more complex operation and significant extra information to
correctly model the constraint accumulation process.

The other reason why the presence of negative functions in the domain is 
beneficial is efficiency. The most efficient implementations of Pos and Bfun, 
such as the ones described in [42,59], are based on Reduced Ordered Binary 
Decision Diagrams (ROBDD) [60]. While an ROBDD representing the imprecise
information given by the formula (3) has a worst case complexity that is
exponential in n, the more precise formula $\neg x$ has constant complexity.

The following theorem shows how most of the operators needed to compute 
the concrete semantics of a logic program can be correctly approximated on 
the abstract domain Bfun. Notice how the addition of equations is modeled 
by logical conjunction and projection of a variable is modeled by existential 
quantification. 

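Theorem 23 Let $\Sigma, \Sigma_1, \Sigma_2 \in \wp(\mathrm{RSubst})$ and $\phi, \phi_1, \phi_2 \in \mathrm{Bfun}$ be such that
$\gamma_F(\phi) \supseteq \Sigma$, $\gamma_F(\phi_1) \supseteq \Sigma_1$, and $\gamma_F(\phi_2) \supseteq \Sigma_2$. Let also $(x \mapsto t) \in \mathrm{Bind}$, where
$\{x\} \cup \mathrm{vars}(t) \subseteq \mathit{VI}$. Then the following hold:

[statements (23a) to (23d) are missing from the source]

\[ \gamma_F(\phi_1 \lor \phi_2) \supseteq \Sigma_1 \cup \Sigma_2; \tag{23e} \]

[statement (23f) is missing from the source]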


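Cases (23a), (23b), and (23d) of Theorem 23 ensure that the following
definition of $\mathrm{amgu}_F$ provides a correct approximation on Bfun of the concrete
unification of rational trees.

Definition 24 The function $\mathrm{amgu}_F\colon \mathrm{Bfun} \times \mathrm{Bind} \to \mathrm{Bfun}$ captures the effects
of a binding on a finite-tree dependency formula. Let $\phi \in \mathrm{Bfun}$ and $(x \mapsto t) \in \mathrm{Bind}$
be such that $\{x\} \cup \mathrm{vars}(t) \subseteq \mathit{VI}$. Then

\[
\mathrm{amgu}_F(\phi, x \mapsto t) \stackrel{\mathrm{def}}{=}
\begin{cases}
\phi \land \bigl(x \leftrightarrow \bigwedge \mathrm{vars}(t)\bigr), & \text{if } x \notin \mathrm{vars}(t); \\
\phi \land \neg x, & \text{otherwise.}
\end{cases}
\]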
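
Definition 24 can be prototyped with an explicit truth-table representation of
Boolean functions. The sketch below is a minimal illustration, assuming that a
function over VI is stored as the set of its models (each model being the set
of variables assigned 1); the names `top`, `amgu_f` and `exists` are ours, and
practical implementations would rather use ROBDDs as discussed above.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Set

Model = FrozenSet[str]      # the variables assigned 1 by a truth assignment
BoolFun = FrozenSet[Model]  # a Boolean function, stored as its set of models


def top(vi: Iterable[str]) -> BoolFun:
    """Return the constant-true function over vi: every assignment is a model."""
    vs = sorted(vi)
    return frozenset(
        frozenset(c) for r in range(len(vs) + 1) for c in combinations(vs, r)
    )


def amgu_f(phi: BoolFun, x: str, t_vars: Set[str]) -> BoolFun:
    """Conjoin phi with the dependency induced by the binding x -> t."""
    if x in t_vars:
        # Cyclic binding: x is (permanently) not finite.
        return frozenset(m for m in phi if x not in m)
    # x is finite if and only if all the variables of t are finite.
    return frozenset(m for m in phi if (x in m) == (t_vars <= m))


def exists(phi: BoolFun, v: str) -> BoolFun:
    """Existentially quantify v, keeping the models over the same variable set."""
    return frozenset(m - {v} for m in phi) | frozenset(m | {v} for m in phi)


vi = {"x", "y", "u"}
# The binding X = f(Y, U) followed by the projection of the local variable U.
phi = exists(amgu_f(top(vi), "x", {"y", "u"}), "u")
implication = frozenset(m for m in top(vi) if "x" not in m or "y" in m)
print(phi == implication)
```

In the usage shown, the binding of x to a term over y and a local variable u
yields the equivalence between x and the conjunction of y and u; projecting u
away by existential quantification leaves the implication from x to y.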



Other semantic operators, such as the consistent renaming of variables, are 
very simple and omitted for the sake of brevity. 

The next result shows how finite-tree dependencies may improve the finiteness 
information encoded in the h component of the domain H x P. 

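Theorem 25 Let $h \in H$ and $\phi \in \mathrm{Bfun}$. Let also $h' \stackrel{\mathrm{def}}{=} \mathrm{true}\bigl(\phi \land \bigwedge h\bigr)$. Then

\[ \gamma_H(h) \cap \gamma_F(\phi) = \gamma_H(h') \cap \gamma_F(\phi). \]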

Example 26 Consider the following program, where it is assumed that the
only "external" query is 'r(X, Y)':

p(X, Y) :- X = f(Y, _).
q(X, Y) :- X = f(_, Y).

r(X, Y) :- p(X, Y), q(X, Y), acyclic_term(X).

Then the predicate p/2 in the clause defining r/2 will be called with X and Y
both unbound. Computing on the abstract domain $H \times P$ gives us the finiteness
description $h_p = \{x, y\}$, expressing the fact that both X and Y are bound to finite
terms. Computing on the finite-tree dependencies domain Bfun gives us the
Boolean formula $\phi_p = x \rightarrow y$ (Y is finite if X is so).

Considering now the call to the predicate q/2, we note that, since variable
X is already bound to a non-variable term sharing with Y, all the finiteness
information encoded by H will be lost (i.e., $h_q = \emptyset$). So, both X and Y are
detected as possibly cyclic. However, the finite-tree dependency information is
preserved, since we have $\phi_q = (x \rightarrow y) \land (x \rightarrow y) = x \rightarrow y$.

Finally, consider the effect of the abstract evaluation of acyclic_term(X). On
the $H \times P$ domain we can only infer that variable X cannot be bound to an
infinite term, while Y will be still considered as possibly cyclic, so that $h_r = \{x\}$.
On the domain Bfun we can just confirm that the finite-tree dependency
computed so far still holds, so that $\phi_r = x \rightarrow y$ (no stronger finite-tree dependency
can be inferred, since the finiteness of X is only contingent). Thus, by applying
the result of Theorem 25, we can recover the finiteness of Y:

\[ h'_r = \mathrm{true}\bigl(\phi_r \land \bigwedge h_r\bigr) = \mathrm{true}\bigl((x \rightarrow y) \land x\bigr) = \mathrm{true}(x \land y) = \{x, y\}. \]
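
The reduction step of Theorem 25 is easy to replay on the last step of the
example. The following Python sketch is only illustrative: Boolean functions
are again stored as sets of models, and `true_vars` implements our reading of
true(φ) as the set of variables of VI that are entailed by φ (an assumption of
this sketch, since the operator is defined elsewhere in the paper).

```python
from itertools import combinations
from typing import FrozenSet, Iterable

Model = FrozenSet[str]      # the variables assigned 1 by a truth assignment
BoolFun = FrozenSet[Model]  # a Boolean function, stored as its set of models


def top(vi: Iterable[str]) -> BoolFun:
    """Return the constant-true function over vi."""
    vs = sorted(vi)
    return frozenset(
        frozenset(c) for r in range(len(vs) + 1) for c in combinations(vs, r)
    )


def true_vars(phi: BoolFun, vi: Iterable[str]) -> FrozenSet[str]:
    """Return the variables that hold in every model of phi (all of vi if phi is false)."""
    return frozenset(vi).intersection(*phi)


def reduce_h(h: FrozenSet[str], phi: BoolFun, vi: Iterable[str]) -> FrozenSet[str]:
    """Return h' = true(phi AND (conjunction of the variables in h))."""
    strengthened = frozenset(m for m in phi if h <= m)
    return true_vars(strengthened, vi)


vi = {"x", "y"}
phi_r = frozenset(m for m in top(vi) if "x" not in m or "y" in m)  # x -> y
h_r = frozenset({"x"})
print(sorted(reduce_h(h_r, phi_r, vi)))
```

Starting from the description where only x is known to be finite and from the
dependency of y on x, the reduction returns both variables, as in the example.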

Information encoded in H x P and Bfun is not completely orthogonal and the 
following result provides a kind of consistency check. 

Theorem 27 Let $h \in H$ and $\phi \in \mathrm{Bfun}$. Then

[the displayed statement is missing from the source]

Note however that, provided the abstract operators are correct, the computed
descriptions will always be mutually consistent, unless $\phi = \bot$.



6 Groundness Dependencies 

Since information about the groundness of variables is crucial for many ap- 
plications, it is natural to consider a static analysis domain including both a 
finite-tree and a groundness component. In fact, any reasonably precise imple- 
mentation of the parameter component P of the abstract domain specified in 
Section 4 will include some kind of groundness information. We highlight 
similarities, differences and connections relating the domain Bfun for finite- 
tree dependencies to the abstract domain Pos for groundness dependencies. 
Note that these results also hold when considering a combination of Bfun with 
the groundness domain Def [42]. 

We first define how elements of Pos represent sets of substitutions in rational 
solved form. 



Footnote: One could define P so that it explicitly contains the abstract domain
Pos. Even when this is not the case, it should be noted that, as soon as the
parameter P includes the set-sharing domain of Jacobs and Langen [61], then it
will subsume the groundness information captured by the domain Def [62,63].

Definition 28 ($\gamma_G\colon \mathrm{Pos} \to \wp(\mathrm{RSubst})$.) The function $\mathrm{gval}\colon \mathrm{RSubst} \to \mathrm{Bval}$
is defined as follows, for each $\sigma \in \mathrm{RSubst}$ and each $x \in \mathit{VI}$:

\[ \mathrm{gval}(\sigma)(x) = 1 \iff x \in \mathrm{gvars}(\sigma). \]

The concretization function $\gamma_G\colon \mathrm{Pos} \to \wp(\mathrm{RSubst})$ is defined, for each $\psi \in \mathrm{Pos}$, by

\[ \gamma_G(\psi) \stackrel{\mathrm{def}}{=} \bigl\{\, \sigma \in \mathrm{RSubst} \bigm| \forall \tau \in \mathop{\downarrow}\sigma : \psi\bigl(\mathrm{gval}(\tau)\bigr) = 1 \,\bigr\}. \]

The following is a simple variant of the standard abstract unification operator 
for groundness analysis over finite-tree domains: the only difference concerns 
the case of cyclic bindings [64] . 

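Definition 29 The function $\mathrm{amgu}_G\colon \mathrm{Pos} \times \mathrm{Bind} \to \mathrm{Pos}$ captures the effects
of a binding on a groundness dependency formula. Let $\psi \in \mathrm{Pos}$ and $(x \mapsto t) \in \mathrm{Bind}$
be such that $\{x\} \cup \mathrm{vars}(t) \subseteq \mathit{VI}$. Then

\[ \mathrm{amgu}_G(\psi, x \mapsto t) \stackrel{\mathrm{def}}{=} \psi \land \Bigl(x \leftrightarrow \bigwedge\bigl(\mathrm{vars}(t) \setminus \{x\}\bigr)\Bigr). \]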



The next result shows how, by exploiting the finiteness component H, the 
finite-tree dependencies (Bfun) component and the groundness dependencies 
(Pos) component can improve each other. 

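Theorem 30 Let $h \in H$, $\phi \in \mathrm{Bfun}$ and $\psi \in \mathrm{Pos}$. Let also $\phi' \in \mathrm{Bfun}$ and
$\psi' \in \mathrm{Pos}$ be defined as $\phi' = \exists \mathit{VI} \setminus h \mathrel{.} \psi$ and $\psi' = \mathrm{pos}(\exists \mathit{VI} \setminus h \mathrel{.} \phi)$. Then

\begin{align*}
\gamma_H(h) \cap \gamma_F(\phi) \cap \gamma_G(\psi) &= \gamma_H(h) \cap \gamma_F(\phi) \cap \gamma_G(\psi \land \psi'); \tag{30a} \\
\gamma_H(h) \cap \gamma_F(\phi) \cap \gamma_G(\psi) &= \gamma_H(h) \cap \gamma_F(\phi \land \phi') \cap \gamma_G(\psi). \tag{30b}
\end{align*}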



Moreover, even without any knowledge of the H component, combining The- 
orem 25 and Eq. (30a), the groundness dependencies component can be im- 
proved. 

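Theorem 31 Let $\phi \in \mathrm{Bfun}$ and $\psi \in \mathrm{Pos}$. Then

\[ \gamma_F(\phi) \cap \gamma_G(\psi) = \gamma_F(\phi) \cap \gamma_G\bigl(\psi \land \bigwedge \mathrm{true}(\phi)\bigr). \]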



The following example shows that, when computing on rational trees, finite- 
tree dependencies may provide groundness information that is not captured 
by the usual approaches. 

Example 32 Consider the program:

p(a, Y).
p(X, a).

q(X, Y) :- p(X, Y), X = f(X, Z).

The abstract semantics of p/2, for both finite-tree and groundness
dependencies, is $\phi_p = \psi_p = x \lor y$. The finite-tree dependency for q/2 is
$\phi_q = (x \lor y) \land \neg x = \neg x \land y$. Using Definition 29, the groundness dependency for
q/2 is

\[ \psi_q = \exists z \mathrel{.} \bigl((x \lor y) \land (x \leftrightarrow z)\bigr) = x \lor y. \]

This can be improved, using Theorem 31, to

\[ \psi'_q = \psi_q \land \bigwedge \mathrm{true}(\phi_q) = y. \]
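
The computations of this example can be replayed mechanically. The Python
sketch below is illustrative only: Boolean functions over {x, y, z} are stored
as sets of models, `amgu_f` and `amgu_g` follow Definitions 24 and 29, and
`true_vars` encodes our reading of true(φ) as the set of variables entailed by
φ; all function names are assumptions of the sketch.

```python
from itertools import combinations
from typing import FrozenSet, Iterable, Set

Model = FrozenSet[str]      # the variables assigned 1 by a truth assignment
BoolFun = FrozenSet[Model]  # a Boolean function, stored as its set of models


def top(vi: Iterable[str]) -> BoolFun:
    """Return the constant-true function over vi."""
    vs = sorted(vi)
    return frozenset(
        frozenset(c) for r in range(len(vs) + 1) for c in combinations(vs, r)
    )


def exists(phi: BoolFun, v: str) -> BoolFun:
    """Existentially quantify v, keeping the models over the same variable set."""
    return frozenset(m - {v} for m in phi) | frozenset(m | {v} for m in phi)


def amgu_f(phi: BoolFun, x: str, t_vars: Set[str]) -> BoolFun:
    """Finite-tree dependencies: a cyclic binding yields the negated atom."""
    if x in t_vars:
        return frozenset(m for m in phi if x not in m)
    return frozenset(m for m in phi if (x in m) == (t_vars <= m))


def amgu_g(psi: BoolFun, x: str, t_vars: Set[str]) -> BoolFun:
    """Groundness dependencies: x is ground iff the other variables of t are."""
    rest = set(t_vars) - {x}
    return frozenset(m for m in psi if (x in m) == (rest <= m))


def true_vars(phi: BoolFun, vi: Iterable[str]) -> FrozenSet[str]:
    """Return the variables that hold in every model of phi."""
    return frozenset(vi).intersection(*phi)


vi = {"x", "y", "z"}
x_or_y = frozenset(m for m in top(vi) if "x" in m or "y" in m)  # semantics of p/2

# Body of q/2: the binding X = f(X, Z), then projection of the local variable Z.
phi_q = exists(amgu_f(x_or_y, "x", {"x", "z"}), "z")
psi_q = exists(amgu_g(x_or_y, "x", {"x", "z"}), "z")

# Theorem 31: strengthen psi_q with the variables that phi_q proves finite.
psi_q_improved = frozenset(m for m in psi_q if true_vars(phi_q, vi) <= m)
print(sorted(true_vars(phi_q, vi)), psi_q == x_or_y)
```

The groundness formula computed for q/2 is still the disjunction of x and y,
whereas the improved formula has exactly the models in which y holds.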

It is worth noticing that the groundness information can be improved
regardless of whether, like Pos, the groundness domain captures disjunctive
information: groundness information represented by the less expressive domain Def
[42] can be improved as well. The next example illustrates this point.

Example 33 Consider the following program: 

p(a, a).
p(X, Y) :- X = f(X, _).

q(X, Y) :- p(X, Y), X = a.

Consider the predicate p/2. Concerning finite-tree dependencies, the abstract
semantics of p/2 is expressed by the Boolean formula $\phi_p = (x \land y) \lor \lnot x = x \rightarrow y$
(Y is finite if X is so). In contrast, the Pos-groundness abstract semantics of
p/2 is a plain "don't know": the Boolean formula $\psi_p = (x \land y) \lor \top = \top$. In
fact, the groundness of X and Y can be completely decided by the call-pattern
of p/2.

Consider now the predicate q/2. The finiteness semantics of q/2 is given by
$\phi_q = (x \rightarrow y) \land x = x \land y$, whereas the Pos formula expressing groundness
dependencies is $\psi_q = \top \land x = x$. By Theorem 31, we obtain

\[ \psi'_q = \psi_q \land \bigwedge \mathrm{true}(\phi_q) = x \land y, \]

therefore recovering the groundness of variable y.

Since better groundness information, besides being useful in itself, may also 
improve the precision of many other analyses such as sharing [28,29,62], the 
reduction steps given by Theorems 30 and 31 can trigger improvements to the
precision of other components. Theorem 30 can also be exploited to recover 
precision after the application of a widening operator on either the groundness 
dependencies or the finite-tree dependencies component. 



7 Experimental Results 

The work described here has been experimentally evaluated in the framework 
provided by China [64], a data-flow analyzer for constraint logic languages 
(i.e., ISO Prolog, CLP(R), clp(FD) and so forth). China performs bottom-up
analysis deriving information on both call-patterns and success-patterns
by means of program transformations and optimized fixpoint computation 
techniques. An abstract description is computed for the call- and success- 
patterns for each predicate defined in the program. 

We implemented and compared the three domains Pattern($P$), Pattern($H \times P$)
and Pattern($\mathrm{Bfun} \times H \times P$), where the parameter component $P$ has been
instantiated to the domain $\mathrm{Pos} \times \mathrm{SFL}_2$ [28,32,33] for tracking groundness, freeness,
linearity and (non-redundant) set-sharing information. The Pattern($\cdot$)
operator [49] further upgrades the precision of its argument by adding explicit
structural information. Note that the analyzer tracks the finiteness of
the terms that can be bound to those abstract variables occurring as leaves
in the acyclic term structure computed by the Pattern($\cdot$) component; therefore,
in order to show that an abstract variable is definitely bound to a finite
term, the basic domain Pattern($P$) has to prove that this variable is definitely
free.

Footnote: More precisely, China uses a variation of the Magic Templates algorithm [65], in
order to obtain goal-dependent information, and a sophisticated chaotic iteration
strategy proposed in [66,67] (recursive fixpoint iteration on the weak topological
ordering defined by partitioning of the call graph into strongly-connected subcomponents).

Footnote: For ease of notation, the domain names are shortened to P, H and B, respectively.

Footnote: Put in other words, by considering just the variables occurring inside the pattern
structure, we systematically disregard those cases when the basic domain is able
to prove that a particular argument position is definitely bound to a finite and
ground term such as f(a). Clearly, the same approach is consistently adopted when
considering the more accurate analysis domains.

Concerning the Bfun component, the implementation was straightforward,
since all the techniques described in [59] (and almost all the code, including the
widenings) were reused unchanged, obtaining comparable efficiency. As a consequence,
most of the implementation effort was in the coding of the abstract
operators on the H component and in the reduction processes between the
different components. A key choice, in this sense, is when the reduction steps
given in Theorems 25 and 30 should be applied. When striving for maximum
precision, a trivial strategy is to perform reductions immediately after any
application of any abstract operator. This is how predicates like acyclic_term/1
should be handled: after adding the variables of the argument to the H component,
the reduction process is applied to propagate the new information to all
domain components. However, such an approach turns out to be unnecessarily
inefficient. In fact, the next result shows that Theorems 25 and 30 cannot lead
to a precision improvement if applied just after the abstract evaluation of the
merge-over-all-paths or the existential quantification operations (provided the
initial descriptions are already reduced).

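Theorem 34 Let $x \in \mathrm{VI}$, $h, h' \in H$, $\phi, \phi' \in \mathrm{Bfun}$ and $\psi, \psi' \in \mathrm{Pos}$ and suppose
that $\gamma_H(h) \cap \gamma_F(\phi) \neq \emptyset$. Let

\[ h_1 = h \cap h', \qquad \phi_1 = \phi \lor \phi', \qquad \psi_1 = \psi \lor \psi', \]
\[ h_2 = \mathrm{proj}_H(h, x), \qquad \phi_2 = \exists x \,.\, \phi, \qquad \psi_2 = \exists x \,.\, \psi. \]

Let also

\[ h \supseteq \mathrm{true}\bigl( \phi \land \bigwedge h \bigr), \qquad \phi \models (\exists \mathrm{VI} \setminus h \,.\, \psi), \qquad \psi \models \mathrm{pos}(\exists \mathrm{VI} \setminus h \,.\, \phi), \]
\[ h' \supseteq \mathrm{true}\bigl( \phi' \land \bigwedge h' \bigr), \qquad \phi' \models (\exists \mathrm{VI} \setminus h' \,.\, \psi'), \qquad \psi' \models \mathrm{pos}(\exists \mathrm{VI} \setminus h' \,.\, \phi'). \]

Then, for $i = 1, 2$,

\[ h_i \supseteq \mathrm{true}\bigl( \phi_i \land \bigwedge h_i \bigr), \qquad \phi_i \models (\exists \mathrm{VI} \setminus h_i \,.\, \psi_i), \qquad \psi_i \models \mathrm{pos}(\exists \mathrm{VI} \setminus h_i \,.\, \phi_i). \]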



A goal-dependent analysis was run for all the programs in our benchmark
suite. For 116 of them, the analyzer detects that the program is not amenable
to goal-dependent analysis, either because the entry points are unknown or
because the program uses builtins in a way that every predicate can be called
with any call-pattern, so that the analysis provides results that are so imprecise
as to be irrelevant. The precision results for the remaining 248 programs are
summarized in Table 1. Here, the precision is measured as the percentage of
the total number of variables that the analyser can show to be finite. Two
alternative views are provided.

Footnote: The suite comprises all the logic programs we have access to (including everything
we could find by systematically dredging the Internet): 364 programs, 24 MB of
code, 800 K lines. Besides classical benchmarks, several real programs of respectable
size are included, the largest one containing 10063 clauses in 45658 lines of code.
The suite also comprises a few synthetic benchmarks, which are artificial programs
explicitly constructed to stress the capabilities of the analyzer and of its abstract
domains with respect to precision and/or efficiency. The interested reader can find
more information at the URI http://www.cs.unipr.it/China/.

Prec. class          P      H      B
p = 100              2     84     86
80 ≤ p < 100         1     31     36
60 ≤ p < 80          7     26     23
40 ≤ p < 60          6     41     40
20 ≤ p < 40         47     47     46
 0 ≤ p < 20        185     19     17

Prec. improvement    P → H    H → B
i > 20                 185        4
10 < i ≤ 20             31        3
 5 < i ≤ 10             11        6
 2 < i ≤ 5               4       10
 0 < i ≤ 2               2       24
no improvement          15      201

Table 1
The precision on finite variables when using P, H and B.

In the first view, each column is labeled by an analysis domain and each row is
labeled by a precision interval. For instance, the value '31' at the intersection
of column 'H' and row '80 ≤ p < 100' is to be read as "for 31 benchmarks, the
percentage p of the total number of variables that the analyzer can show to be
finite using the domain H is between 80% and 100%."

The second view provides a better picture of the precision improvements obtained
when moving from P to H (in the column 'P → H') and from H to B
(in the column 'H → B'). For instance, the value '10' at the intersection of
column 'H → B' and row '2 < i ≤ 5' is to be read as "when moving from H to
B, for 10 benchmarks the improvement i in the percentage of the total number
of variables shown to be finite was between 2% and 5%."

It can be seen from Table 1 that, even though the H domain is remarkably pre- 
cise, the inclusion of the Bfun component allows for a further, and sometimes 
significant, precision improvement for a number of benchmarks. It is worth 
noting that the current implementation of China does not yet fully exploit 
the finite-tree dependencies arising when evaluating many of the built-in predicates,
therefore incurring an avoidable precision loss. We are working on this
issue and we expect that the specialized implementation of the abstract evalu- 
ation of some built-ins will result in more and better precision improvements. 
The experimentation has also shown that, in practice, the Bfun component 
does not improve the groundness information. 

Concerning efficiency, our experimentation has shown that the techniques we propose
are practical. The total analysis time for the 248 programs for
which we give precision results in Table 1 is 596 seconds for P, 602 seconds for 
H, and 1211 seconds for B. It should be stressed that, as mentioned before, 
the implementation of Bfun was derived in a straightforward way from the one 
of Pos described in [59]. We believe that a different tuning of the widenings 
we employ in that component could reduce the gap between the efficiency of 
H and the one of B. 



8 Conclusion 

Several modern logic-based languages offer a computation domain based on 
rational trees. On the one hand, the use of such trees is encouraged by the pos- 
sibility of using efficient and correct unification algorithms and by an increase 
in expressivity. On the other hand, these gains are countered by the extra
problems that rational trees bring with them, which can be summarized
as follows: several built-ins, library predicates, program analysis and manip- 
ulation techniques are only well-defined for program fragments working with 
finite trees. 

As a consequence, those applications that exploit rational trees tend to do so 
in a very controlled way, that is, most program variables can only be bound 
to finite terms. By detecting the program variables that may be bound to 
infinite terms with a good degree of accuracy, we can significantly reduce the 
disadvantages of using rational trees. 

In this paper we have proposed an abstract-interpretation based solution to 
this problem, where the composite abstract domain H x P allows tracking of 
the creation and propagation of infinite terms. Even though this information 
is crucial to any finite-tree analysis, propagating the guarantees of finiteness 
that come from several built-ins (including those that are explicitly provided 
to test term-finiteness) is also important. Therefore, we have introduced a
domain of Boolean functions Bfun for finite-tree dependencies which, when
coupled to the domain $H \times P$, can enhance its expressive power. Since Bfun has
many similarities with the domain Pos used for groundness analysis, we have
investigated how these two domains relate to each other and, in particular,
the synergy arising from their combination in the "global" domain of analysis.

Footnote (analysis times reported in Section 7): On a PC system equipped with an
Athlon XP 2800 CPU, 1 GB of RAM and running GNU/Linux.

A An Instance of the Parameter Domain P



As discussed in Section 4, several abstract domains for sharing analysis can be
used to implement the parameter component P. We here consider the abstract
domain SFL [32,33], integrating the set-sharing domain of Jacobs and Langen
with definite freeness and linearity information.



Definition 35 (The set-sharing domain SH.) The set SH is defined by
$\mathrm{SH} \stackrel{\mathrm{def}}{=} \wp(\mathrm{SG})$, where $\mathrm{SG} \stackrel{\mathrm{def}}{=} \wp(\mathrm{VI}) \setminus \{\emptyset\}$ is the set of sharing groups. SH is
ordered by subset inclusion.



The information about definite freeness and linearity is encoded by two sets 
of variables, one for each property. 

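Definition 36 (The domain SFL.) Let $F \stackrel{\mathrm{def}}{=} \wp(\mathrm{VI})$ and $L \stackrel{\mathrm{def}}{=} \wp(\mathrm{VI})$ be
partially ordered by reverse subset inclusion. The domain SFL is defined by
the Cartesian product $\mathrm{SFL} \stackrel{\mathrm{def}}{=} \mathrm{SH} \times F \times L$ ordered by $\leq_S$, the component-wise
extension of the orderings defined on the sub-domains; the bottom element is
$\bot_S \stackrel{\mathrm{def}}{=} (\emptyset, \mathrm{VI}, \mathrm{VI})$.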

In the next definition we introduce a few well-known operations on the set- 
sharing domain SH. These will be used to define the operations on the domain 
SFL. 

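Definition 37 (Abstract operators on SH.) For each $\mathit{sh} \in \mathrm{SH}$ and each
$V \subseteq \mathrm{VI}$, the extraction of the relevant component of $\mathit{sh}$ with respect to $V$ is
given by the function $\mathrm{rel} \colon \wp(\mathrm{VI}) \times \mathrm{SH} \to \mathrm{SH}$ defined as

\[ \mathrm{rel}(V, \mathit{sh}) \stackrel{\mathrm{def}}{=} \{\, S \in \mathit{sh} \mid S \cap V \neq \emptyset \,\}. \]

For each $\mathit{sh} \in \mathrm{SH}$ and each $V \subseteq \mathrm{VI}$, the function $\overline{\mathrm{rel}} \colon \wp(\mathrm{VI}) \times \mathrm{SH} \to \mathrm{SH}$ gives
the irrelevant component of $\mathit{sh}$ with respect to $V$. It is defined as

\[ \overline{\mathrm{rel}}(V, \mathit{sh}) \stackrel{\mathrm{def}}{=} \mathit{sh} \setminus \mathrm{rel}(V, \mathit{sh}). \]

The function $(\cdot)^\star \colon \mathrm{SH} \to \mathrm{SH}$, called star-union, is given, for each $\mathit{sh} \in \mathrm{SH}$, by

\[ \mathit{sh}^\star \stackrel{\mathrm{def}}{=} \Bigl\{\, S \in \mathrm{SG} \Bigm| \exists n \geq 1 \,.\, \exists T_1, \ldots, T_n \in \mathit{sh} \,.\, S = \bigcup_{i=1}^{n} T_i \,\Bigr\}. \]

For each $\mathit{sh}_1, \mathit{sh}_2 \in \mathrm{SH}$, the function $\mathrm{bin} \colon \mathrm{SH} \times \mathrm{SH} \to \mathrm{SH}$, called binary union,
is given by

\[ \mathrm{bin}(\mathit{sh}_1, \mathit{sh}_2) \stackrel{\mathrm{def}}{=} \{\, S_1 \cup S_2 \mid S_1 \in \mathit{sh}_1, S_2 \in \mathit{sh}_2 \,\}. \]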
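
The four operators introduced so far translate almost literally into operations on sets of sets. The sketch below is illustrative only: it assumes that sharing groups are encoded as Python `frozenset`s of variable names, and the names `rel_bar`, `star`, `bin_union` and `group` are our own choices, not identifiers from the analyzer.

```python
from itertools import combinations

def rel(V, sh):
    """Relevant component: the sharing groups of sh that meet V."""
    return {S for S in sh if S & V}

def rel_bar(V, sh):
    """Irrelevant component: the sharing groups of sh disjoint from V."""
    return sh - rel(V, sh)

def star(sh):
    """Star-union: all unions of one or more sharing groups of sh."""
    closed = set(sh)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot, since closed grows inside the loop.
        for a, b in combinations(list(closed), 2):
            if a | b not in closed:
                closed.add(a | b)
                changed = True
    return closed

def bin_union(sh1, sh2):
    """Binary union: pairwise unions of groups taken from sh1 and sh2."""
    return {a | b for a in sh1 for b in sh2}

group = frozenset  # hypothetical shorthand: group("xy") stands for {x, y}
sh = {group("xy"), group("yz"), group("w")}
sh_x = rel(set("x"), sh)   # groups containing x
sh_z = rel(set("z"), sh)   # groups containing z
```

The star-union is computed as a closure under pairwise union, which yields the same set as taking all unions of n ≥ 1 groups; this is exponential in the worst case, so the sketch is only meant for small sharing sets.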
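For each $\mathit{sh} \in \mathrm{SH}$ and each $(x \mapsto t) \in \mathrm{Bind}$, the function $\mathrm{cyclic}_x^t \colon \mathrm{SH} \to \mathrm{SH}$
strengthens the sharing set $\mathit{sh}$ by forcing the coupling of $x$ with $t$:

\[ \mathrm{cyclic}_x^t(\mathit{sh}) \stackrel{\mathrm{def}}{=} \overline{\mathrm{rel}}\bigl( \{x\} \cup \mathrm{vars}(t), \mathit{sh} \bigr) \cup \mathrm{rel}\bigl( \mathrm{vars}(t) \setminus \{x\}, \mathit{sh} \bigr). \]

For each $\mathit{sh} \in \mathrm{SH}$ and each $x \in \mathrm{VI}$, the function $\mathrm{proj}_{\mathrm{SH}} \colon \mathrm{SH} \times \mathrm{VI} \to \mathrm{SH}$
projects away variable $x$ from $\mathit{sh}$:

\[ \mathrm{proj}_{\mathrm{SH}}(\mathit{sh}, x) \stackrel{\mathrm{def}}{=} \bigl\{ \{x\} \bigr\} \cup \bigl\{\, S \setminus \{x\} \bigm| S \in \mathit{sh}, S \neq \{x\} \,\bigr\}. \]

It is now possible to define the implementation, on the domain SFL, of all the
predicates and functions specified in Definition 8.

Definition 38 (Abstract operators on SFL.) For each $d \in \mathrm{SFL}$ and
$s, t \in \mathrm{HTerms}$, where $d = (\mathit{sh}, f, l)$ and $\mathrm{vars}(s) \cup \mathrm{vars}(t) \subseteq \mathrm{VI}$, let $\mathit{sh}_s = \mathrm{rel}\bigl(\mathrm{vars}(s), \mathit{sh}\bigr)$
and $\mathit{sh}_t = \mathrm{rel}\bigl(\mathrm{vars}(t), \mathit{sh}\bigr)$. Then

\begin{align*}
  &\;\;\vdots \\
  \mathrm{share\_same\_var}_d(s, t) &\stackrel{\mathrm{def}}{=} \mathrm{vars}(\mathit{sh}_s \cap \mathit{sh}_t); \\
  \mathrm{share\_with}_d(t) &\stackrel{\mathrm{def}}{=} \mathrm{vars}(\mathit{sh}_t).
\end{align*}

The function $\mathrm{amgu}_S \colon \mathrm{SFL} \times \mathrm{Bind} \to \mathrm{SFL}$ captures the effects of a binding
on an element of SFL. Let $d = (\mathit{sh}, f, l) \in \mathrm{SFL}$ and $(x \mapsto t) \in \mathrm{Bind}$, where
$\{x\} \cup \mathrm{vars}(t) \subseteq \mathrm{VI}$. Let also

\[ \mathit{sh}' \stackrel{\mathrm{def}}{=} \mathrm{cyclic}_x^t(\mathit{sh}_- \cup \mathit{sh}''), \]

where

\begin{align*}
  \mathit{sh}_x &\stackrel{\mathrm{def}}{=} \mathrm{rel}\bigl(\{x\}, \mathit{sh}\bigr), &
  \mathit{sh}_t &\stackrel{\mathrm{def}}{=} \mathrm{rel}\bigl(\mathrm{vars}(t), \mathit{sh}\bigr), \\
  \mathit{sh}_{xt} &\stackrel{\mathrm{def}}{=} \mathit{sh}_x \cap \mathit{sh}_t, &
  \mathit{sh}_- &\stackrel{\mathrm{def}}{=} \overline{\mathrm{rel}}\bigl(\{x\} \cup \mathrm{vars}(t), \mathit{sh}\bigr),
\end{align*}

\[
  \mathit{sh}'' \stackrel{\mathrm{def}}{=}
  \begin{cases}
    \mathrm{bin}(\mathit{sh}_x, \mathit{sh}_t), & \text{if } \mathrm{free}_d(x) \lor \mathrm{free}_d(t); \\
    \mathrm{bin}\bigl( \mathit{sh}_x \cup \mathrm{bin}(\mathit{sh}_x, \mathit{sh}_{xt}^\star),\; \mathit{sh}_t \cup \mathrm{bin}(\mathit{sh}_t, \mathit{sh}_{xt}^\star) \bigr), & \text{if } \mathrm{lin}_d(x) \land \mathrm{lin}_d(t); \\
    \mathrm{bin}(\mathit{sh}_x^\star, \mathit{sh}_t), & \text{if } \mathrm{lin}_d(x); \\
    \mathrm{bin}(\mathit{sh}_x, \mathit{sh}_t^\star), & \text{if } \mathrm{lin}_d(t); \\
    \mathrm{bin}(\mathit{sh}_x^\star, \mathit{sh}_t^\star), & \text{otherwise.}
  \end{cases}
\]

Letting $S_x \stackrel{\mathrm{def}}{=} \mathrm{share\_with}_d(x)$ and $S_t \stackrel{\mathrm{def}}{=} \mathrm{share\_with}_d(t)$, we also define

\[
  f' \stackrel{\mathrm{def}}{=}
  \begin{cases}
    f, & \text{if } \mathrm{free}_d(x) \land \mathrm{free}_d(t); \\
    f \setminus S_x, & \text{if } \mathrm{free}_d(x); \\
    f \setminus S_t, & \text{if } \mathrm{free}_d(t); \\
    f \setminus (S_x \cup S_t), & \text{otherwise;}
  \end{cases}
\]

\[ l' \stackrel{\mathrm{def}}{=} \bigl( \mathrm{VI} \setminus \mathrm{vars}(\mathit{sh}') \bigr) \cup f' \cup l'', \]

where

\[
  l'' \stackrel{\mathrm{def}}{=}
  \begin{cases}
    l \setminus (S_x \cap S_t), & \text{if } \mathrm{lin}_d(x) \land \mathrm{lin}_d(t); \\
    l \setminus S_x, & \text{if } \mathrm{lin}_d(x); \\
    l \setminus S_t, & \text{if } \mathrm{lin}_d(t); \\
    l \setminus (S_x \cup S_t), & \text{otherwise.}
  \end{cases}
\]

Then

\[ \mathrm{amgu}_S(d, x \mapsto t) \stackrel{\mathrm{def}}{=} (\mathit{sh}', f', l'). \]

The function $\mathrm{proj}_S \colon \mathrm{SFL} \times \mathrm{VI} \to \mathrm{SFL}$ correctly captures the operation of projecting
away a variable from an element of SFL. For each $d \in \mathrm{SFL}$ and $x \in \mathrm{VI}$,

\[
  \mathrm{proj}_S(d, x) \stackrel{\mathrm{def}}{=}
  \begin{cases}
    \bot_S, & \text{if } d = \bot_S; \\
    \bigl( \mathrm{proj}_{\mathrm{SH}}(\mathit{sh}, x),\; f \cup \{x\},\; l \cup \{x\} \bigr), & \text{if } d = (\mathit{sh}, f, l) \neq \bot_S.
  \end{cases}
\]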



Observe that a set-sharing domain such as SFL is strictly more precise for
term finiteness information than a pair-sharing domain such as $\mathrm{SFL}_2$ [32,33]
(where the set-sharing component SH in SFL is replaced by the domain PSD
as defined in [69,70]). To see this, consider the abstract evaluation of the
binding $x \mapsto y$ and the description $(h, d) \in H \times \mathrm{SFL}$, where $h = \{x, y, z\}$
and $d = (\mathit{sh}, f, l)$ is such that $\mathit{sh} = \bigl\{ \{x, y\}, \{x, z\}, \{y, z\} \bigr\}$, $f = \emptyset$ and
$l = \{x, y, z\}$. Then $z \notin \mathrm{share\_same\_var}_d(x, y)$ so that we have $h' = \{z\}$.
In contrast, when using a pair-sharing domain such as $\mathrm{SFL}_2$ the element $d$
is equivalent to $d' = (\mathit{sh}', f, l)$, where $\mathit{sh}' = \mathit{sh} \cup \bigl\{ \{x, y, z\} \bigr\}$. Hence we have
$z \in \mathrm{share\_same\_var}_{d'}(x, y)$ and $h' = \emptyset$. Thus, in $\mathit{sh}'$ the information provided
by the sharing group $\{x, y, z\}$ is redundant for the pair-sharing and groundness
properties, but not redundant for term finiteness. Note that the above
observation holds regardless of the pair-sharing variant considered, so that
similar examples can be obtained for ASub [71,51] and $\mathrm{Sh}^{\mathrm{PSh}}$ [72].
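
The difference between the two descriptions can be checked mechanically. The following sketch recomputes share_same_var for the sharing sets of the example above; the encoding of sharing groups as `frozenset`s and the helper names are assumptions made for illustration only.

```python
def rel(V, sh):
    """Sharing groups of sh that contain at least one variable of V."""
    return {S for S in sh if S & V}

def share_same_var(sh, s_vars, t_vars):
    """vars(sh_s & sh_t): variables occurring in groups relevant to both s and t."""
    common = rel(s_vars, sh) & rel(t_vars, sh)
    return set().union(*common)

group = frozenset  # hypothetical shorthand: group("xy") stands for {x, y}

# Set-sharing description: x, y and z share pairwise, but never all together.
sh = {group("xy"), group("xz"), group("yz")}
# Pair-sharing cannot tell sh apart from sh extended with {x, y, z}.
sh_prime = sh | {group("xyz")}

with_set_sharing = share_same_var(sh, {"x"}, {"y"})
with_pair_sharing = share_same_var(sh_prime, {"x"}, {"y"})
```

Here `with_set_sharing` does not contain `z`, whereas `with_pair_sharing` does, which is the reason why the finiteness of z is preserved only in the first case.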

Although the domain SFL described here is very precise and used to imple- 
ment the parameter component P for computing our experimental results, 
it is not intended as the target of the generic specification given in Defini- 
tion 8; more powerful sharing domains can also satisfy this schema, including 
all the enhanced combinations considered in [28,29]. For instance, as the predicate
$\mathrm{gfree}_d$ defined on SFL does not fully exploit the disjunctive nature of
its generic specification $\mathrm{gfree}_d$, the precision of the analysis may be improved
by adding a domain component explicitly tracking ground-or-freeness, as proposed
in [28,29]. The same argument applies to the predicate $\mathrm{or\_lin}_d$, with
respect to its generic specification $\mathrm{or\_lin}_d$, when considering the combination with the groundness domain
Pos.



B Proofs of the Stated Results 

This appendix provides the proofs of the results stated in the paper. Section B.1
introduces the notations and preliminary concepts that are subsequently
used in the proofs. In Section B.2 we recall a few general results holding
for (syntactic) equality theories and provide the proof of Proposition 2.
The definition of (strongly) variable idempotent substitutions is given in Sec- 
tion B.3, together with some properties holding for them; these are then used 
in Section B.4 to prove some general results on operators on substitutions in 
RSubst, Propositions 13 and 15. Section B.4 is propaedeutic to Section B.5, 
where we prove Theorem 17 and to Section B.6, where we provide the proofs 
of Theorems 19 and 21. Results in Section B.4 are then used in Section B.7 
to prove Theorems 23, 25 and 27, and in Section B.8 to prove Theorems 30 
and 34. 

B.1 Notations and Preliminaries for the Proofs

To simplify the expressions in the paper, any variable in a formula that is not 
in the scope of an explicit quantifier is assumed to be universally quantified. 
A path $p \in (\mathbb{N} \setminus \{0\})^\star$ is any finite sequence of non-zero natural numbers.
The empty path is denoted by $\epsilon$, whereas $i \,.\, p$ denotes the path obtained
by concatenating the sequence formed by the natural number $i \neq 0$ with the
sequence of the path $p$. Given a path $p$ and a (possibly infinite) term $t \in \mathrm{Terms}$,
we denote by $t[p]$ the subterm of $t$ found by following path $p$. Formally,

\[
  t[p] =
  \begin{cases}
    t, & \text{if } p = \epsilon; \\
    t_i[q], & \text{if } p = i \,.\, q \land (1 \leq i \leq n) \land t = f(t_1, \ldots, t_n).
  \end{cases}
\]

Note that $t[p]$ is only defined for those paths $p$ actually corresponding to
subterms of $t$.

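The function $\mathrm{size} \colon \mathrm{HTerms} \to \mathbb{N}$ is defined, for each $t \in \mathrm{HTerms}$, by

\[
  \mathrm{size}(t) \stackrel{\mathrm{def}}{=}
  \begin{cases}
    1, & \text{if } t \in \mathrm{Vars}; \\
    1 + \sum_{i=1}^{n} \mathrm{size}(t_i), & \text{if } t = f(t_1, \ldots, t_n), \text{ where } n \geq 0.
  \end{cases}
\]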
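
The two definitions above, subterm selection along a path and term size, can be sketched directly on a simple encoding of finite terms. The encoding is an assumption of ours: a variable is a Python string, and a compound term or constant is a tuple whose first component is the functor name.

```python
def subterm(t, path):
    """t[p]: follow a path of 1-based argument positions.

    Raises ValueError when the path does not correspond to a subterm of t,
    mirroring the fact that t[p] is only defined for such paths.
    """
    for i in path:
        if isinstance(t, str) or not (1 <= i <= len(t) - 1):
            raise ValueError("path does not correspond to a subterm")
        t = t[i]  # t[0] is the functor, t[i] the i-th argument
    return t

def size(t):
    """Number of variable and functor occurrences in a finite term."""
    if isinstance(t, str):
        return 1
    return 1 + sum(size(arg) for arg in t[1:])

# The term f(X, g(a, Y)).
term = ("f", "X", ("g", ("a",), "Y"))
```

For instance, `subterm(term, [2, 1])` selects the constant `a`, the empty path selects the whole term, and `size(term)` counts the five symbols f, X, g, a and Y.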



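A substitution $\sigma$ is idempotent if, for all $t \in \mathrm{HTerms}$, we have $t\sigma\sigma = t\sigma$. The
set of all idempotent substitutions is denoted by ISubst and $\mathrm{ISubst} \subset \mathrm{RSubst}$.

If $t \in \mathrm{HTerms}$, we denote the set of variables that occur more than once in $t$
by:

\[ \mathrm{nlvars}(t) \stackrel{\mathrm{def}}{=} \bigl\{\, y \in \mathrm{vars}(t) \bigm| \lnot \mathrm{occ\_lin}(y, t) \,\bigr\}. \]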

If $\bar{s} = (s_1, \ldots, s_n) \in \mathrm{HTerms}^n$ and $\bar{t} = (t_1, \ldots, t_n) \in \mathrm{HTerms}^n$ are two tuples
of finite terms, then we let $\bar{s} = \bar{t}$ denote the set of equations between
corresponding components of $\bar{s}$ and $\bar{t}$. Namely,

\[ (\bar{s} = \bar{t}) \stackrel{\mathrm{def}}{=} \{\, s_i = t_i \mid 1 \leq i \leq n \,\}. \]

Moreover, we overload the functions mvars, occ_lin and nlvars to work on
tuples of terms; thus, we will say that $\bar{s}$ is linear if and only if $\mathrm{nlvars}(\bar{s}) = \emptyset$.
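
As an illustration of nlvars and of linearity for tuples, the sketch below counts variable occurrences. It assumes the same hypothetical term encoding as elsewhere in these examples (variables are strings, compound terms and constants are tuples headed by the functor name), and it assumes that the overloading to tuples counts occurrences across all components.

```python
from collections import Counter

def count_vars(t, counts):
    """Accumulate in counts the number of occurrences of each variable of t."""
    if isinstance(t, str):
        counts[t] += 1
    else:
        for arg in t[1:]:
            count_vars(arg, counts)

def nlvars(terms):
    """Variables occurring more than once in a list of terms (a tuple of terms)."""
    counts = Counter()
    for t in terms:
        count_vars(t, counts)
    return {v for v, n in counts.items() if n > 1}

def is_linear(terms):
    """A tuple of terms is linear iff no variable occurs more than once in it."""
    return not nlvars(terms)
```

With this encoding, the single term f(X, g(X, Y)) has `nlvars` equal to `{"X"}`, the tuple (f(X), g(Y)) is linear, and the tuple (f(X), X) is not.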



B.1.1 Equality Theories 

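Let $\{r, s, t, s_1, \ldots, s_n, t_1, \ldots, t_m\} \subseteq \mathrm{HTerms}$. We assume that any equality theory
$T$ over Terms includes the congruence axioms denoted by the following
schemata:

\[ s = s, \tag{B.1} \]
\[ s = t \leftrightarrow t = s, \tag{B.2} \]
\[ r = s \land s = t \rightarrow r = t, \tag{B.3} \]
\[ s_1 = t_1 \land \cdots \land s_n = t_n \rightarrow f(s_1, \ldots, s_n) = f(t_1, \ldots, t_n). \tag{B.4} \]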
In logic programming and most implementations of Prolog it is usual to assume
an equality theory based on syntactic identity. This consists of the congruence
axioms together with the identity axioms denoted by the following schemata,
where $f$ and $g$ are distinct function symbols or $n \neq m$:

\[ f(s_1, \ldots, s_n) = f(t_1, \ldots, t_n) \rightarrow s_1 = t_1 \land \cdots \land s_n = t_n, \tag{B.5} \]
\[ \lnot \bigl( f(s_1, \ldots, s_n) = g(t_1, \ldots, t_m) \bigr). \tag{B.6} \]

The axioms characterized by schemata (B.5) and (B.6) ensure the equality 
theory depends only on the syntax. The equality theory for a non-syntactic 
domain replaces these axioms by ones that depend instead on the semantics of 
the domain and, in particular, on the interpretation given to functor symbols. 

The equality theory of Clark [73] on which pure logic programming is based,
usually called the Herbrand equality theory and denoted $\mathcal{FT}$, is given by the
congruence axioms, the identity axioms, and the axiom schema

\[ \forall z \in \mathrm{Vars} : \forall t \in (\mathrm{HTerms} \setminus \mathrm{Vars}) : z \in \mathrm{vars}(t) \rightarrow \lnot (z = t). \tag{B.7} \]

Axioms characterized by the schema (B.7) are called the occurs-check ax- 
ioms and are an essential part of the standard unification procedure in SLD- 
resolution. 
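
To make the role of the identity and occurs-check axioms concrete, here is a small syntactic unification sketch. It is not the unification procedure of any particular Prolog system: the term encoding (variables as strings, compound terms and constants as tuples headed by the functor name) and the function names are our own assumptions. Functor clashes fail as in (B.6), arguments are decomposed as in (B.5), and the optional occurs-check enforces (B.7).

```python
def walk(t, subst):
    """Dereference a variable through the bindings collected so far."""
    while isinstance(t, str) and t in subst:
        t = subst[t]
    return t

def occurs(v, t, subst):
    """True if variable v occurs in t (only called while subst is acyclic)."""
    t = walk(t, subst)
    if isinstance(t, str):
        return t == v
    return any(occurs(v, arg, subst) for arg in t[1:])

def unify(s, t, occurs_check=True):
    """Return a dict of bindings unifying s and t, or None on failure."""
    subst = {}
    stack = [(s, t)]
    while stack:
        a, b = stack.pop()
        a, b = walk(a, subst), walk(b, subst)
        if a == b:
            continue
        if isinstance(a, str):
            if occurs_check and occurs(a, b, subst):
                return None                      # (B.7): z occurs in t
            subst[a] = b
        elif isinstance(b, str):
            if occurs_check and occurs(b, a, subst):
                return None                      # (B.7), symmetric case
            subst[b] = a
        elif a[0] != b[0] or len(a) != len(b):
            return None                          # (B.6): functor or arity clash
        else:
            stack.extend(zip(a[1:], b[1:]))      # (B.5): decompose arguments
    return subst
```

With the check enabled, `unify("X", ("f", "X"))` fails; with `occurs_check=False` it returns the cyclic binding of X to f(X). Note that, without the check, this naive loop is not a complete rational-tree unifier and may fail to terminate when two distinct cyclic bindings are compared.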

An alternative approach, used in some implementations of Prolog, does not
require the occurs-check axioms. This approach is based on the theory of
rational trees $\mathcal{RT}$ [1,38]. It assumes the congruence axioms and the identity
axioms together with a uniqueness axiom, for each substitution in rational 
solved form. Informally speaking these state that, after assigning a ground 
rational tree to each parameter variable, the substitution uniquely defines a 
ground rational tree for each of its domain variables. 

In the sequel we will use the expression "equality theory" to denote any consistent,
decidable theory $T$ satisfying the congruence axioms. We will also use
the expression "syntactic equality theory" to denote any equality theory $T$
also satisfying the identity axioms. Note that both $\mathcal{FT}$ and $\mathcal{RT}$ are syntactic
equality theories. When the equality theory $T$ is clear from the context, it
is convenient to adopt the notations $\sigma \Longrightarrow \tau$ and $\sigma \Longleftrightarrow \tau$, where $\sigma, \tau$ are
sets of equations, to denote $T \vdash \forall(\sigma \rightarrow \tau)$ and $T \vdash \forall(\sigma \leftrightarrow \tau)$, respectively.

Footnote: Note that, as a consequence of axiom (B.6) and the assumption that there are
at least two distinct function symbols in the language, one of which is a constant,
there exist two terms $a_1, a_2 \in \mathrm{GTerms} \cap \mathrm{HTerms}$ such that, for any syntactic equality
theory $T$, we have $T \vdash a_1 \neq a_2$.

Given an equality theory $T$, and a set of equations in rational solved form $\sigma$,
we say that $\sigma$ is satisfiable in $T$ if $T \vdash \forall \mathrm{Vars} \setminus \mathrm{dom}(\sigma) : \exists \mathrm{dom}(\sigma) \,.\, \sigma$. Observe
that, given an arbitrary equality theory $T$, a substitution in RSubst may not
be satisfiable in $T$. For example, $\exists x \,.\, \{x = f(x)\}$ is false in the Clark equality
theory. However, as every element of RSubst satisfies the identity axioms as
well as the axioms (B.5) and (B.6) and, as the uniqueness axioms do not affect
satisfiability, every element of RSubst is satisfiable in $\mathcal{RT}$.



B.2 Properties of Equality Theories 



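Proof of Proposition 2 on page 10. Suppose $\tau \in {\downarrow}\sigma$. Then, by Definition 1,
for some $\upsilon \in \mathrm{RSubst}$, $\tau \in \mathrm{mgs}(\sigma \cup \upsilon)$, and hence $\mathcal{RT} \vdash \forall\bigl(\tau \leftrightarrow (\sigma \cup \upsilon)\bigr)$.
Therefore $\mathcal{RT} \vdash \forall(\tau \rightarrow \sigma)$.

Conversely, suppose $\mathcal{RT} \vdash \forall(\tau \rightarrow \sigma)$. Then $\mathcal{RT} \vdash \forall\bigl(\tau \rightarrow (\sigma \cup \tau)\bigr)$ so that, as
$\mathcal{RT} \vdash \forall\bigl((\sigma \cup \tau) \rightarrow \tau\bigr)$, we have $\mathcal{RT} \vdash \forall\bigl(\tau \leftrightarrow (\sigma \cup \tau)\bigr)$. Therefore $\tau \in \mathrm{mgs}(\sigma \cup \tau)$
so that, by Definition 1, $\tau \in {\downarrow}\sigma$. □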



We now prove a number of results about substitutions in RSubst, assuming 
suitable equality theories, that will be used in the proofs of our main results. 

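Lemma 39 Let $\sigma \in \mathrm{RSubst}$ and $\{x \mapsto t\} \in \mathrm{RSubst}$ be both satisfiable in the
equality theory $T$, where $x \notin \mathrm{dom}(\sigma)$ and $\mathrm{vars}(t) \cap \mathrm{dom}(\sigma) = \emptyset$. Define also
$\sigma' \stackrel{\mathrm{def}}{=} \sigma \cup \{x \mapsto t\}$. Then $\sigma' \in \mathrm{RSubst}$ and $\sigma'$ is satisfiable in $T$.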



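PROOF. Note that $\sigma'$ is a substitution, since $\sigma \in \mathrm{RSubst}$ and $x \notin \mathrm{dom}(\sigma)$.
Moreover, as $\mathrm{vars}(t) \cap \mathrm{dom}(\sigma) = \emptyset$, $\sigma'$ cannot contain circular subsets. Hence,
$\sigma' \in \mathrm{RSubst}$.

Since both $\sigma$ and $\{x \mapsto t\}$ are satisfiable in $T$, we have

\[ T \vdash \forall \mathrm{Vars} \setminus \mathrm{dom}(\sigma) : \exists \mathrm{dom}(\sigma) \,.\, \sigma, \]
\[ T \vdash \forall \mathrm{Vars} \setminus \{x\} : \exists x \,.\, \{x = t\}. \]

Letting $V = \mathrm{Vars} \setminus \bigl(\mathrm{dom}(\sigma) \cup \{x\}\bigr)$, we can rewrite these as

\[ T \vdash \forall V : \forall x : \exists \mathrm{dom}(\sigma) \,.\, \sigma, \tag{B.8} \]
\[ T \vdash \forall V : \forall \mathrm{dom}(\sigma) : \exists x \,.\, \{x = t\}. \tag{B.9} \]

Then, as $\mathrm{vars}(x = t) \cap \mathrm{dom}(\sigma) = \emptyset$, it follows from (B.9) that

\[ T \vdash \forall V : \exists x \,.\, \{x = t\}. \]

Combining this with (B.8) gives

\[ T \vdash \forall V : \bigl( (\forall x : \exists \mathrm{dom}(\sigma) \,.\, \sigma) \land (\exists x \,.\, \{x = t\}) \bigr). \]

Thus we have

\[ T \vdash \forall V : \exists x \,.\, \bigl( \exists \mathrm{dom}(\sigma) \,.\, \sigma \land \{x = t\} \bigr), \]

and hence, as $\mathrm{vars}(x = t) \cap \mathrm{dom}(\sigma) = \emptyset$,

\[ T \vdash \forall V : \exists x \,.\, \exists \mathrm{dom}(\sigma) \,.\, \bigl( \sigma \land \{x = t\} \bigr). \]

Therefore,

\[ T \vdash \forall V : \exists \bigl( \mathrm{dom}(\sigma) \cup \{x\} \bigr) \,.\, \sigma \cup \{x = t\}. \]

Thus $\sigma'$ is satisfiable in $T$. □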

Lemma 40 Assume $T$ is an equality theory and $\sigma \in \mathrm{RSubst}$. Then, for each
$t \in \mathrm{HTerms}$,

\[ T \vdash \forall\bigl( \sigma \rightarrow (t = t\sigma) \bigr). \]

PROOF. Proved in [54, Lemma 2]. □

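Lemma 41 Assume $T$ is an equality theory and $\sigma \in \mathrm{RSubst}$. Then, for each
$s, t \in \mathrm{HTerms}$,

\[ T \vdash \forall\bigl( \sigma \cup \{s = t\} \leftrightarrow \sigma \cup \{s = t\sigma\} \bigr). \]

PROOF. First, note, using the congruence axioms (B.2) and (B.3), that, for
any terms $p, q, r \in \mathrm{HTerms}$,

\[ T \vdash \forall(p = q \land q = r) \leftrightarrow \forall(p = r \land q = r). \tag{B.10} \]

Secondly note that, using Lemma 40, for any substitution $\tau \in \mathrm{RSubst}$ and
term $r \in \mathrm{HTerms}$, $T \vdash \forall\bigl(\tau \rightarrow (r = r\tau)\bigr)$. Thus

\[ T \vdash \forall\bigl( \tau \leftrightarrow \tau \cup \{r = r\tau\} \bigr). \tag{B.11} \]

Using these results, we obtain

\begin{align*}
  & T \vdash \forall\bigl( \sigma \cup \{s = t\} \leftrightarrow \sigma \cup \{s = t, t = t\sigma\} \bigr), && \text{[by (B.11)]} \\
  & T \vdash \forall\bigl( \sigma \cup \{s = t\} \leftrightarrow \sigma \cup \{s = t\sigma, t = t\sigma\} \bigr), && \text{[by (B.10)]} \\
  & T \vdash \forall\bigl( \sigma \cup \{s = t\} \leftrightarrow \sigma \cup \{s = t\sigma\} \bigr). && \text{[by (B.11)]}
\end{align*}

□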

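Lemma 42 Let $\sigma \in \mathrm{RSubst}$ be satisfiable in a syntactic equality theory $T$ and
$s, t \in \mathrm{HTerms}$, where $T \vdash \forall\bigl(\sigma \rightarrow (s = t)\bigr)$. Then $\mathrm{rt}(s, \sigma) = \mathrm{rt}(t, \sigma)$.

PROOF. We suppose, towards a contradiction, that $\mathrm{rt}(s, \sigma) \neq \mathrm{rt}(t, \sigma)$. Then
there exists a finite path $p$ such that:

a. $x = \mathrm{rt}(s, \sigma)[p] \in \mathrm{Vars} \setminus \mathrm{dom}(\sigma)$, $y = \mathrm{rt}(t, \sigma)[p] \in \mathrm{Vars} \setminus \mathrm{dom}(\sigma)$ and $x \neq y$;
or

b. $x = \mathrm{rt}(s, \sigma)[p] \in \mathrm{Vars} \setminus \mathrm{dom}(\sigma)$ and $r = \mathrm{rt}(t, \sigma)[p] \notin \mathrm{Vars}$ or, symmetrically,
$r = \mathrm{rt}(s, \sigma)[p] \notin \mathrm{Vars}$ and $x = \mathrm{rt}(t, \sigma)[p] \in \mathrm{Vars} \setminus \mathrm{dom}(\sigma)$; or

c. $r_1 = \mathrm{rt}(s, \sigma)[p] \notin \mathrm{Vars}$, $r_2 = \mathrm{rt}(t, \sigma)[p] \notin \mathrm{Vars}$ and $r_1$ and $r_2$ have different
principal functors.

Then, by definition of 'rt', there exists an index $i \in \mathbb{N}$ such that one of these
holds:

1. $x = s\sigma^i[p] \in \mathrm{Vars} \setminus \mathrm{dom}(\sigma)$, $y = t\sigma^i[p] \in \mathrm{Vars} \setminus \mathrm{dom}(\sigma)$ and $x \neq y$; or

2. $x = s\sigma^i[p] \in \mathrm{Vars} \setminus \mathrm{dom}(\sigma)$ and $r = t\sigma^i[p] \notin \mathrm{Vars}$ or, in a symmetrical
way, $r = s\sigma^i[p] \notin \mathrm{Vars}$ and $x = t\sigma^i[p] \in \mathrm{Vars} \setminus \mathrm{dom}(\sigma)$; or

3. $r_1 = s\sigma^i[p] \notin \mathrm{Vars}$ and $r_2 = t\sigma^i[p] \notin \mathrm{Vars}$ have different principal functors.

By Lemma 40, we have $T \vdash \forall\bigl(\sigma \rightarrow (s\sigma^i = t\sigma^i)\bigr)$; from this, since $T$ is a
syntactic equality theory, we obtain that

\[ T \vdash \forall\bigl( \sigma \rightarrow (s\sigma^i[p] = t\sigma^i[p]) \bigr). \tag{B.12} \]

We now prove that each case leads to a contradiction.

Consider case 1. Let $r_1, r_2 \in \mathrm{GTerms} \cap \mathrm{HTerms}$ be two terms having different
principal functors, so that $T \vdash \forall(r_1 \neq r_2)$. Then, as $\sigma$ is satisfiable in $T$, by
Lemma 39, we have that $\sigma' = \sigma \cup \{x \mapsto r_1, y \mapsto r_2\} \in \mathrm{RSubst}$ is satisfiable
in $T$ and also $T \vdash \forall(\sigma' \rightarrow \sigma)$, $T \vdash \forall\bigl(\sigma' \rightarrow (x = r_1)\bigr)$, $T \vdash \forall\bigl(\sigma' \rightarrow (y = r_2)\bigr)$.
This is a contradiction, since, by (B.12), we have $T \vdash \forall\bigl(\sigma \rightarrow (x = y)\bigr)$.

Consider case 2. Without loss of generality, consider the first subcase, where
$x = s\sigma^i[p] \in \mathrm{Vars} \setminus \mathrm{dom}(\sigma)$ and $r = t\sigma^i[p] \notin \mathrm{Vars}$. Let $r' \in \mathrm{GTerms} \cap \mathrm{HTerms}$
be such that $r$ and $r'$ have different principal functors, so that $T \vdash \forall(r \neq r')$.
By Lemma 39, as $\sigma$ is satisfiable in $T$, $\sigma' = \sigma \cup \{x \mapsto r'\} \in \mathrm{RSubst}$ is satisfiable
in $T$; we also have that $T \vdash \forall(\sigma' \rightarrow \sigma)$ and $T \vdash \forall\bigl(\sigma' \rightarrow (x = r')\bigr)$. This is a
contradiction as, by (B.12), $T \vdash \forall\bigl(\sigma \rightarrow (x = r)\bigr)$.

Finally, consider case 3. In this case $T \vdash \forall(r_1 \neq r_2)$. This immediately leads
to a contradiction, since, by (B.12), $T \vdash \forall\bigl(\sigma \rightarrow (r_1 = r_2)\bigr)$. □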

Lemma 43 Let $T$ be a syntactic equality theory. Let $s \in \mathrm{HTerms} \cap \mathrm{GTerms}$
and $t \in \mathrm{HTerms}$ be such that $\mathrm{size}(t) > \mathrm{size}(s)$. Then $T \vdash \forall(s \neq t)$.

PROOF. By induction on $m = \mathrm{size}(s)$. For the base case, when $m = 1$, we
have that $s$ is a term functor of arity 0. Since $\mathrm{size}(t) > 1$, then $t = f(t_1, \ldots, t_n)$,
where $n > 0$. Then, by the identity axioms, we have $T \vdash \forall(s \neq t)$.

For the inductive case, when $m > 1$, assume that the result holds for all
$m' < m$ and let $s = f(s_1, \ldots, s_n)$, where $n > 0$. Since $\mathrm{size}(t) > m$, we have
$t = f'(t_1, \ldots, t_{n'})$, where $n' > 0$. If $f \neq f'$ or $n \neq n'$ then, by the identity
axioms, we have $T \vdash \forall(s \neq t)$. Otherwise, let $f = f'$ and $n = n'$. Note
that, for all $i \in \{1, \ldots, n\}$, we have $\mathrm{size}(s_i) < m$. Also, there exists an index
$j \in \{1, \ldots, n\}$ such that $\mathrm{size}(t_j) > \mathrm{size}(s_j)$. By the inductive hypothesis,
$T \vdash \forall(s_j \neq t_j)$ so that, by the identity axioms, $T \vdash \forall(s \neq t)$. □



The next two propositions establish useful properties of the function rt. 

Proposition 44 Let $\sigma \in \mathrm{RSubst}$ and $t \in \mathrm{HTerms}$. Then

\[ \mathrm{vars}\bigl(\mathrm{rt}(t, \sigma)\bigr) \cap \mathrm{dom}(\sigma) = \emptyset, \tag{44a} \]
\[ \mathrm{rt}(t, \sigma) \in \mathrm{HTerms} \iff \exists i \in \mathbb{N} \,.\, \mathrm{rt}(t, \sigma) = t\sigma^i. \tag{44b} \]

PROOF.

(44a) Let $x \in \mathrm{dom}(\sigma)$ and, towards a contradiction, suppose $x \in \mathrm{vars}\bigl(\mathrm{rt}(t, \sigma)\bigr)$.
Thus, there exists a finite path $p$ such that $x = \mathrm{rt}(t, \sigma)[p]$. Thus, by definition
of 'rt', there exists an index $i \in \mathbb{N}$ such that $x = \sigma^i(t)[p]$. Since $x \in \mathrm{dom}(\sigma)$,
then $x \neq x\sigma$, so that $x \neq \sigma^{i+1}(t)[p]$. Also note that, since $\sigma \in \mathrm{RSubst}$, $\sigma$
contains no circular subsets, so that we have $x \neq \sigma^j(t)[p]$, for each index
$j > i$. This implies $x \neq \mathrm{rt}(t, \sigma)[p]$, which is a contradiction. Since no such
finite path $p$ can exist, we can conclude $x \notin \mathrm{vars}\bigl(\mathrm{rt}(t, \sigma)\bigr)$.

(44b) Since substitutions map finite terms into finite terms, a finite number of
applications cannot produce an infinite term, so that the left implication holds.
Proving the right implication by contraposition, suppose that $\mathrm{rt}(t, \sigma) \neq t\sigma^i$,
for all $i \in \mathbb{N}$. Then, by definition of 'rt', we have $t\sigma^i \neq t\sigma^{i+1}$ for all $i \in \mathbb{N}$.
Letting $n \in \mathbb{N}$ be the number of bindings in $\sigma \in \mathrm{RSubst}$, for all $i \in \mathbb{N}$ we
have that $\mathrm{size}(t\sigma^i) < \mathrm{size}(t\sigma^{i+n})$, because $\sigma$ has no circular subsets. Thus
$\mathrm{rt}(t, \sigma) \notin \mathrm{HTerms}$, because there is no finite upper bound to the number of
function symbols occurring in $\mathrm{rt}(t, \sigma)$. □
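
Property (44b) suggests a simple finiteness test: rt(t, σ) is finite exactly when repeated application of σ to t reaches a fixpoint. The sketch below tracks only the domain variables still present after each application instead of building the terms. The bound of n steps, with n the number of bindings, is our own observation (a chain of more than n dependent domain variables must revisit one of them); the term encoding (variables as strings, compound terms as tuples headed by the functor name, substitutions as dictionaries) is likewise an assumption made for illustration.

```python
def term_vars(t):
    """Set of variables occurring in a finite term."""
    if isinstance(t, str):
        return {t}
    return set().union(*(term_vars(arg) for arg in t[1:]))

def rt_is_finite(t, sigma):
    """Decide whether rt(t, sigma) is a finite term.

    frontier holds the domain variables occurring in t sigma^k; the fixpoint
    of (44b) is reached at step k exactly when frontier is empty.
    """
    dom = set(sigma)
    frontier = term_vars(t) & dom
    for _ in range(len(sigma)):
        if not frontier:
            return True
        frontier = set().union(*(term_vars(sigma[v]) for v in frontier)) & dom
    return not frontier

# X is bound to the cyclic term f(X, Z); Y is bound to g(Z).
sigma = {"X": ("f", "X", "Z"), "Y": ("g", "Z")}
```

With this substitution, `rt_is_finite("X", sigma)` is false, while the test succeeds for `"Y"`, for the unbound variable `"Z"`, and for any term built from them only.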



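Proposition 45 Let $\sigma, \tau \in \mathrm{RSubst}$ be satisfiable in a syntactic equality theory
$T$ and $W \subseteq \mathrm{Vars}$, where $T \vdash \forall(\exists W \,.\, \tau \rightarrow \exists W \,.\, \sigma)$, and $x \in \mathrm{Vars} \setminus W$. Then

\[ \mathrm{rt}(x, \tau) \in \mathrm{HTerms} \implies \mathrm{rt}(x, \sigma) \in \mathrm{HTerms}. \]

PROOF. We assume that $\mathrm{rt}(x, \tau) \in \mathrm{HTerms}$ but $\mathrm{rt}(x, \sigma) \notin \mathrm{HTerms}$ and
derive a contradiction. By hypothesis $\mathrm{rt}(x, \tau) \in \mathrm{HTerms}$, so that by Proposition 44,
there exists $i \in \mathbb{N}$ such that $\mathrm{rt}(x, \tau) = x\tau^i$ and also $\mathrm{vars}(x\tau^i) \cap
\mathrm{dom}(\tau) = \emptyset$. Let $t \in \mathrm{GTerms} \cap \mathrm{HTerms}$ and

\[ \upsilon \stackrel{\mathrm{def}}{=} \bigl\{\, y \mapsto t \bigm| y \in \mathrm{vars}(x\tau^i) \,\bigr\}. \]

Then, as $\tau$ is satisfiable in $T$, by Lemma 39, $\tau' \stackrel{\mathrm{def}}{=} \tau \cup \upsilon \in \mathrm{RSubst}$ is also
satisfiable in $T$. Moreover, we have that $x\tau^i\tau' \in \mathrm{GTerms} \cap \mathrm{HTerms}$. Define
now $n \stackrel{\mathrm{def}}{=} \mathrm{size}(x\tau^i\tau')$. As $\mathrm{rt}(x, \sigma) \notin \mathrm{HTerms}$, there exists $j \in \mathbb{N}$ such that
$\mathrm{size}(x\sigma^j) > n$. Therefore, by Lemma 43,

\[ T \vdash \forall(x\tau^i\tau' \neq x\sigma^j). \tag{B.14} \]

By Lemma 40,

\[ T \vdash \forall\bigl( \tau \rightarrow (x = x\tau^i) \bigr). \tag{B.15} \]

Also, by Lemma 40, $T \vdash \forall\bigl(\sigma \rightarrow (x = x\sigma^j)\bigr)$ so that, as $T$ is a first-order
theory,

\[ T \vdash \forall\bigl( \exists W \,.\, \sigma \rightarrow \exists W \,.\, (x = x\sigma^j) \bigr). \tag{B.16} \]

By definition of $\tau'$, $\forall(\tau' \rightarrow \tau)$. Hence, by hypothesis and the logically true
statement $\forall(\tau \rightarrow \exists W \,.\, \tau)$, we obtain $T \vdash \forall(\tau' \rightarrow \exists W \,.\, \sigma)$. Observe that
$\mathrm{vars}(x = x\tau^i\tau') = \{x\}$ and, as a consequence, $\mathrm{vars}(x = x\tau^i\tau') \cap W = \emptyset$.
Therefore, by (B.15) and (B.16), we obtain

\begin{align*}
  & T \vdash \forall\bigl( \tau' \rightarrow (x = x\tau^i\tau' \land \exists W \,.\, x = x\sigma^j) \bigr) \\
  \iff{} & T \vdash \forall\bigl( \tau' \rightarrow \exists W \,.\, (x = x\tau^i\tau' \land x = x\sigma^j) \bigr) \\
  \implies{} & T \vdash \forall\bigl( \tau' \rightarrow \exists W \,.\, (x\tau^i\tau' = x\sigma^j) \bigr),
\end{align*}

which contradicts (B.14). □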
B.3 Variable-Idempotence

In [54], (weak) variable-idempotent substitutions were introduced as a subclass
of substitutions in rational solved form in order to allow a more convenient
reasoning about the sharing of variables for possibly non-idempotent substitutions.
In [74] a stronger definition was used, taking into consideration also
the variables in the domain of the substitution. Strong variable-idempotence
is a useful concept when dealing with the finiteness of a rational term and the
multiplicity of variables occurring in it (e.g., when linearity is a property of
interest). In the following we consider this stronger definition, also adopted
in [32,33].

Definition 46 (Variable-idempotence.) A substitution $\sigma \in \mathit{RSubst}$ is said
to be (strongly) variable-idempotent if and only if for all $t \in \mathit{HTerms}$ we have

\[ \mathrm{vars}(t\sigma\sigma) = \mathrm{vars}(t\sigma). \]

The set of variable-idempotent substitutions is denoted $\mathit{VSubst}$.

Note that we have $\mathit{ISubst} \subset \mathit{VSubst} \subset \mathit{RSubst}$.
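
Definition 46 quantifies over all finite terms, but $\mathrm{vars}(t\sigma\sigma)$ is the union of the sets $\mathrm{vars}(x\sigma\sigma)$ over the variables $x$ of $t$, and $x\sigma = x$ outside $\mathrm{dom}(\sigma)$, so it suffices to test the condition on the domain variables. The following Python sketch illustrates this. The encoding is an assumption made only for illustration and is not part of the paper: variables are strings, compound terms are tuples `(functor, arg1, ..., argn)`, a substitution is a dict from variables to terms, and the names `term_vars`, `apply_subst` and `is_variable_idempotent` are hypothetical.

```python
def term_vars(term):
    """Variables of a finite term (variables are str, compound terms are tuples)."""
    if isinstance(term, str):
        return {term}
    found = set()
    for arg in term[1:]:          # term[0] is the functor name, not a variable
        found |= term_vars(arg)
    return found


def apply_subst(term, subst):
    """Apply a substitution once (simultaneously) to a finite term."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(apply_subst(arg, subst) for arg in term[1:])


def is_variable_idempotent(subst):
    """Check vars(x sigma sigma) == vars(x sigma) for every x in dom(sigma)."""
    for x in subst:
        once = apply_subst(x, subst)
        twice = apply_subst(once, subst)
        if term_vars(twice) != term_vars(once):
            return False
    return True


# Illustrative substitutions (assumed examples, not taken from the paper).
chain = {"X": ("f", "Y"), "Y": ("g", "Z")}   # X -> f(Y), Y -> g(Z)
cyclic = {"X": ("f", "X")}                     # X -> f(X)
print(is_variable_idempotent(chain))
print(is_variable_idempotent(cyclic))
```

In this sketch the substitution `cyclic` is not idempotent yet satisfies the check, which is consistent with the strict inclusion of $\mathit{ISubst}$ in $\mathit{VSubst}$ noted above.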

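Definition 47 ($\mathcal{S}$-transformation.) The relation $\overset{\mathcal{S}}{\longmapsto} \subseteq \mathit{RSubst} \times \mathit{RSubst}$,
called $\mathcal{S}$-step, is defined by

\[ \frac{(x \mapsto t) \in \sigma \qquad (y \mapsto s) \in \sigma \qquad x \neq y}
        {\sigma \overset{\mathcal{S}}{\longmapsto} \bigl(\sigma \setminus \{y \mapsto s\}\bigr) \cup \bigl\{y \mapsto s\{x \mapsto t\}\bigr\}} \]

If we have a finite sequence of $\mathcal{S}$-steps $\sigma_1 \overset{\mathcal{S}}{\longmapsto} \cdots \overset{\mathcal{S}}{\longmapsto} \sigma_n$ mapping $\sigma_1$ to $\sigma_n$,
then we write $\sigma_1 \overset{\mathcal{S}}{\longmapsto}{}^{*} \sigma_n$ and say that $\sigma_1$ can be rewritten, by $\mathcal{S}$-transformation,
to $\sigma_n$.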
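
A single $\mathcal{S}$-step can be sketched in Python as follows. This is only an illustration under assumed conventions that do not come from the paper: variables are strings, compound terms are tuples `(functor, arg1, ..., argn)`, substitutions are dicts, and the function names are hypothetical. The step rewrites the binding of `y` by replacing the occurrences of `x` in it with the term bound to `x`.

```python
def apply_subst(term, subst):
    """Apply a substitution once (simultaneously) to a finite term."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(apply_subst(arg, subst) for arg in term[1:])


def s_step(subst, x, y):
    """One S-step: replace (y -> s) by (y -> s{x -> t}), where (x -> t) is in subst."""
    if x == y or x not in subst or y not in subst:
        raise ValueError("an S-step needs two distinct domain variables")
    result = dict(subst)                        # leave the input untouched
    result[y] = apply_subst(subst[y], {x: subst[x]})
    return result


# Assumed example: X -> f(Y), Y -> g(Z); unfold Y inside the binding of X.
sigma = {"X": ("f", "Y"), "Y": ("g", "Z")}
sigma_prime = s_step(sigma, "Y", "X")
print(sigma_prime)
```

Note that the step only changes the right-hand side of one binding, so the domain of the substitution is left as it was.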

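The following theorems show that considering substitutions in $\mathit{VSubst}$ is not
a restrictive hypothesis.

Theorem 48 Suppose $\sigma \in \mathit{RSubst}$ and $\sigma \overset{\mathcal{S}}{\longmapsto}{}^{*} \sigma'$. Then we have $\sigma' \in \mathit{RSubst}$,
$\mathrm{dom}(\sigma) = \mathrm{dom}(\sigma')$, and $\mathrm{vars}(\sigma) = \mathrm{vars}(\sigma')$. Moreover, if $\mathcal{T}$ is any equality
theory, we have $\mathcal{T} \vdash \forall(\sigma \leftrightarrow \sigma')$.

PROOF. Proved in [54, Theorem 1]. □

Theorem 49 Suppose $\sigma \in \mathit{RSubst}$. Then there exists $\sigma' \in \mathit{VSubst}$ such that
$\sigma \overset{\mathcal{S}}{\longmapsto}{}^{*} \sigma'$ and, for all $\tau \subseteq \sigma'$, $\tau \in \mathit{VSubst}$.

PROOF. The proof is the same as the one given for [54, Theorem 2], where a weaker
result, using weak variable-idempotence, was stated. □

Theorem 50 Let $\mathcal{T}$ be an equality theory and $\sigma \in \mathit{RSubst}$. Then there exists
$\sigma' \in \mathit{VSubst}$ such that $\mathrm{dom}(\sigma) = \mathrm{dom}(\sigma')$, $\mathrm{vars}(\sigma) = \mathrm{vars}(\sigma')$, $\mathcal{T} \vdash \forall(\sigma \leftrightarrow \sigma')$
and, for all $\tau \subseteq \sigma'$, $\tau \in \mathit{VSubst}$.

PROOF. The result easily follows from Theorems 48 and 49. □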

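The next result concerning a useful property of variable-idempotent
substitutions will be needed in Subsection B.6 for proving Theorem 19.

Lemma 51 Let $\sigma \in \mathit{VSubst}$ be satisfiable in a syntactic equality theory $\mathcal{T}$. Let
$s \in \mathit{HTerms} \cap \mathit{GTerms}$ and $t \in \mathit{HTerms}$ and suppose that $\mathcal{T} \vdash \forall(\sigma \rightarrow s = t)$.
Then $s = t\sigma$.

PROOF. By hypothesis, $\mathcal{T} \vdash \forall(\sigma \rightarrow s = t)$ and $s, t \in \mathit{HTerms}$ so that we
can apply Lemma 42 to obtain

\[ \mathrm{rt}(s, \sigma) = \mathrm{rt}(t, \sigma). \tag{B.17} \]

By Proposition 44, there exist $i, j \in \mathbb{N}$ such that $\mathrm{rt}(s, \sigma) = s\sigma^{i}$ and $\mathrm{rt}(t, \sigma) = t\sigma^{j}$
and $\mathrm{dom}(\sigma) \cap \mathrm{vars}(t\sigma^{j}) = \emptyset$. Thus, if $j = 0$, we have $t\sigma^{j} = t = t\sigma$.
On the other hand, if $j > 0$, as $\sigma \in \mathit{VSubst}$, $\mathrm{vars}(t\sigma^{j}) = \mathrm{vars}(t\sigma)$ so that
$\mathrm{dom}(\sigma) \cap \mathrm{vars}(t\sigma) = \emptyset$ and hence $t\sigma = t\sigma^{j}$. As $s \in \mathit{GTerms}$, $\mathrm{vars}(s) = \emptyset$ so
that $s = s\sigma^{i}$. Thus, by (B.17) we have $s = t\sigma$. □

B.4 Some Results on the Groundness and Finiteness Operators

The following proposition is proved in [54], and shows that the function 'gvars'
precisely captures the intended property.

Proposition 52 Let $\sigma \in \mathit{RSubst}$ and $x \in \mathit{Vars}$. Then

\[ \mathrm{rt}(x, \sigma) \in \mathit{GTerms} \iff x \in \mathrm{gvars}(\sigma). \]

When computing $\mathrm{hvars}(\sigma)$ by means of the fixpoint computation given in
Definition 11 on page 18, the fixpoint is reached after a single iteration if
$\sigma \in \mathit{VSubst}$.

Lemma 53 For each $\sigma \in \mathit{VSubst}$ we have $\mathrm{hvars}(\sigma) = \mathrm{hvars}_1(\sigma)$.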
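PROOF. We show that $\mathrm{hvars}_2(\sigma) \subseteq \mathrm{hvars}_1(\sigma)$. Let $y \in \mathrm{hvars}_2(\sigma)$. By
Definition 11, we have two cases:

(1) if $y \in \mathrm{hvars}_1(\sigma)$ then there is nothing to prove;

(2) assume now $y \in \mathrm{dom}(\sigma)$ and $\mathrm{vars}(y\sigma) \subseteq \mathrm{hvars}_1(\sigma)$. By Definition 11, we
have two subcases:

    (a) $\mathrm{vars}(y\sigma) \subseteq \mathit{Vars} \setminus \mathrm{dom}(\sigma)$.
        Then $\mathrm{vars}(y\sigma) \subseteq \mathrm{hvars}_0(\sigma)$, so that $y \in \mathrm{hvars}_1(\sigma)$;

    (b) $V \stackrel{\mathrm{def}}{=} \mathrm{vars}(y\sigma) \cap \mathrm{dom}(\sigma) \neq \emptyset$ and, for all $z \in V$,
        $\mathrm{vars}(z\sigma) \cap \mathrm{dom}(\sigma) = \emptyset$.
        Let $z \in V$ so that $z \in \mathrm{vars}(y\sigma)$. By hypothesis, we have $\sigma \in \mathit{VSubst}$
        so that $z \in \mathrm{vars}(y\sigma\sigma)$. As $z \in \mathrm{dom}(\sigma)$ and $\mathrm{vars}(z\sigma) \cap \mathrm{dom}(\sigma) = \emptyset$,
        $z \notin \mathrm{vars}(z\sigma)$. This means that $z \notin \mathrm{vars}(y\sigma\sigma)$, which is a contradiction
        since $\sigma \in \mathit{VSubst}$.

□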

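Proposition 54 For each $\sigma \in \mathit{VSubst}$, we have

\[ \mathrm{hvars}(\sigma) = \bigl\{\, y \in \mathit{Vars} \bigm| \mathrm{vars}(y\sigma) \cap \mathrm{dom}(\sigma) = \emptyset \,\bigr\}. \]

PROOF. The result is obtained by applying Lemma 53 and then unfolding
Definition 11. □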
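
The fixpoint computation of Definition 11 and the closed form of Proposition 54 can be compared on small examples with the following Python sketch. It is a sketch under assumptions that are ours, not the paper's: variables are strings, compound terms are tuples `(functor, arg1, ..., argn)`, substitutions are dicts, and, since $\mathit{Vars}$ is infinite, both functions are restricted to the variables that actually occur in the substitution. The names `hvars_chain` and `hvars_closed_form` are hypothetical.

```python
def term_vars(term):
    """Variables of a finite term (variables are str, compound terms are tuples)."""
    if isinstance(term, str):
        return {term}
    found = set()
    for arg in term[1:]:
        found |= term_vars(arg)
    return found


def universe_of(subst):
    """All variables occurring in the substitution (domain and right-hand sides)."""
    universe = set(subst)
    for term in subst.values():
        universe |= term_vars(term)
    return universe


def hvars_chain(subst):
    """Return [hvars_0, hvars_1, ...] up to and including the fixpoint."""
    chain = [universe_of(subst) - set(subst)]            # hvars_0 = Vars \ dom
    while True:
        prev = chain[-1]
        nxt = prev | {y for y in subst if term_vars(subst[y]) <= prev}
        if nxt == prev:
            return chain
        chain.append(nxt)


def hvars_closed_form(subst):
    """Closed form of Proposition 54; only meaningful for variable-idempotent input."""
    dom = set(subst)
    return {y for y in universe_of(subst)
            if not (term_vars(subst.get(y, y)) & dom)}


# Assumed examples: the first is variable-idempotent, the second is not.
vi = {"X": ("f", ("g", "Z")), "Y": ("g", "Z"), "W": ("h", "W")}
non_vi = {"X": ("f", "Y"), "Y": ("g", "Z")}
print(hvars_chain(vi)[-1] == hvars_closed_form(vi))
print(hvars_chain(non_vi)[-1] == hvars_closed_form(non_vi))
```

For the variable-idempotent example the chain stabilises at its second element, in line with Lemma 53; for the other example the closed form misses a finite variable, which suggests why the hypothesis $\sigma \in \mathit{VSubst}$ is needed.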



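Proposition 55 Let $\sigma \in \mathit{VSubst}$ and $r \in \mathit{HTerms}$, where $\mathrm{vars}(r) \subseteq \mathrm{hvars}(\sigma)$.
Then

\begin{align*}
\mathrm{rt}(r, \sigma) &= r\sigma, \\
\mathrm{vars}(r\sigma) \cap \mathrm{dom}(\sigma) &= \emptyset.
\end{align*}

PROOF. Suppose $y \in \mathrm{vars}(r)$. Then, by Proposition 54, $\mathrm{vars}(y\sigma) \cap \mathrm{dom}(\sigma) = \emptyset$.
Thus, for any $i > 0$, we have $y\sigma^{i} = y\sigma \in \mathit{HTerms}$. Thus $\mathrm{rt}(y, \sigma) = y\sigma$.
As this holds for all $y \in \mathrm{vars}(r)$, it follows that $\mathrm{rt}(r, \sigma) = r\sigma$ and
$\mathrm{vars}(r\sigma) \cap \mathrm{dom}(\sigma) = \emptyset$. □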



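The following result shows that, for a variable-idempotent substitution, the
finiteness operator precisely captures the intended property.

Lemma 56 Let $\sigma \in \mathit{VSubst}$ and $y \in \mathit{Vars}$. Then

\[ \mathrm{rt}(y, \sigma) \in \mathit{HTerms} \iff y \in \mathrm{hvars}(\sigma). \]

PROOF. Since $\sigma \in \mathit{VSubst}$, by Proposition 54 we have $y \in \mathrm{hvars}(\sigma)$ if and
only if $\mathrm{vars}(y\sigma) \cap \mathrm{dom}(\sigma) = \emptyset$.

Let $\mathrm{vars}(y\sigma) \cap \mathrm{dom}(\sigma) = \emptyset$. Then, for any $i > 0$, we have $y\sigma^{i} = y\sigma \in \mathit{HTerms}$.
Hence $\mathrm{rt}(y, \sigma) = y\sigma \in \mathit{HTerms}$.

In order to prove the other implication, let now $\mathrm{rt}(y, \sigma) \in \mathit{HTerms}$. By
Proposition 44, there exists an $i \in \mathbb{N}$ such that $\mathrm{rt}(y, \sigma) = y\sigma^{i}$ and
$\mathrm{vars}(y\sigma^{i}) \cap \mathrm{dom}(\sigma) = \emptyset$. Since $\sigma \in \mathit{VSubst}$, we have $\mathrm{vars}(y\sigma^{i}) = \mathrm{vars}(y\sigma)$, so that
$\mathrm{vars}(y\sigma) \cap \mathrm{dom}(\sigma) = \emptyset$. □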

In order to prove Proposition 13, i.e., to show that the finiteness operator
precisely captures the intended property even for arbitrary substitutions in
$\mathit{RSubst}$, we now prove that this operator is invariant under the application of
$\mathcal{S}$-steps.

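Lemma 57 Let $\sigma, \sigma' \in \mathit{RSubst}$ where $\sigma \overset{\mathcal{S}}{\longmapsto} \sigma'$. Then $\mathrm{hvars}(\sigma) = \mathrm{hvars}(\sigma')$.

PROOF. Let $(x \mapsto t), (y \mapsto s) \in \sigma$, where $x \neq y$, such that

\[ \sigma' = \bigl(\sigma \setminus \{y \mapsto s\}\bigr) \cup \bigl\{y \mapsto s\{x \mapsto t\}\bigr\}. \]

If $x \notin \mathrm{vars}(s)$ then we have $\sigma = \sigma'$ and the result trivially holds. Thus, we
assume $x \in \mathrm{vars}(s)$. We prove the two inclusions separately.

In order to prove $\mathrm{hvars}(\sigma) \subseteq \mathrm{hvars}(\sigma')$ we show, by induction on $m \geq 0$, that
we have

\[ \mathrm{hvars}_m(\sigma) \subseteq \mathrm{hvars}_m(\sigma'). \]

For the base case, when $m = 0$, by Theorem 48 we have $\mathrm{dom}(\sigma) = \mathrm{dom}(\sigma')$
so that

\begin{align*}
\mathrm{hvars}_0(\sigma) &= \mathit{Vars} \setminus \mathrm{dom}(\sigma) \\
                 &= \mathit{Vars} \setminus \mathrm{dom}(\sigma') \\
                 &= \mathrm{hvars}_0(\sigma').
\end{align*}

For the inductive step, when $m > 0$, assume $\mathrm{hvars}_{m-1}(\sigma) \subseteq \mathrm{hvars}_{m-1}(\sigma')$ and
let $z \in \mathrm{hvars}_m(\sigma)$. By Definition 11, we have two cases: if $z \in \mathrm{hvars}_{m-1}(\sigma)$
then the result follows by a straight application of the inductive hypothesis;
otherwise, we have

\[ z \in \mathrm{dom}(\sigma) \land \mathrm{vars}(z\sigma) \subseteq \mathrm{hvars}_{m-1}(\sigma). \]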
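Now, if $z \neq y$ we have $z\sigma = z\sigma'$, so that, by Theorem 48 and the inductive
hypothesis we have

\[ z \in \mathrm{dom}(\sigma') \land \mathrm{vars}(z\sigma') \subseteq \mathrm{hvars}_{m-1}(\sigma'), \]

so that, by Definition 11, $z \in \mathrm{hvars}_m(\sigma')$. Otherwise, if $z = y$, then

\begin{align*}
\mathrm{vars}(z\sigma) &= \mathrm{vars}(s) \\
               &\subseteq \mathrm{hvars}_{m-1}(\sigma).
\end{align*}

Since, by hypothesis, $x \in \mathrm{vars}(s)$,

\begin{align*}
\mathrm{vars}(z\sigma') &= \mathrm{vars}\bigl(s\{x \mapsto t\}\bigr) \\
                &= \bigl(\mathrm{vars}(s) \setminus \{x\}\bigr) \cup \mathrm{vars}(t),
\end{align*}

and we need to show $\mathrm{vars}(z\sigma') \subseteq \mathrm{hvars}_{m-1}(\sigma')$. By the inductive hypothesis
we have

\[ \mathrm{vars}(s) \subseteq \mathrm{hvars}_{m-1}(\sigma'). \]

Note that, since $x \in \mathrm{vars}(s)$, it follows $x \in \mathrm{hvars}_{m-1}(\sigma')$ so that, by
Definition 11,

\begin{align*}
\mathrm{vars}(t) &\subseteq \mathrm{hvars}_{m-2}(\sigma') \\
         &\subseteq \mathrm{hvars}_{m-1}(\sigma').
\end{align*}

In order to prove $\mathrm{hvars}(\sigma) \supseteq \mathrm{hvars}(\sigma')$ we show, by induction on $m \geq 0$, that
we have

\[ \mathrm{hvars}_{m+1}(\sigma) \supseteq \mathrm{hvars}_m(\sigma'). \]

For the base case, when $m = 0$, by Definition 11 and Theorem 48 we have

\begin{align*}
\mathrm{hvars}_1(\sigma) &\supseteq \mathrm{hvars}_0(\sigma) \\
                 &= \mathit{Vars} \setminus \mathrm{dom}(\sigma) \\
                 &= \mathit{Vars} \setminus \mathrm{dom}(\sigma') \\
                 &= \mathrm{hvars}_0(\sigma').
\end{align*}

For the inductive step, when $m > 0$, assume $\mathrm{hvars}_m(\sigma) \supseteq \mathrm{hvars}_{m-1}(\sigma')$ and let
$z \in \mathrm{hvars}_m(\sigma')$. By Definition 11, we have two cases: if $z \in \mathrm{hvars}_{m-1}(\sigma')$ then
the result follows by the inductive hypothesis and by Definition 11; otherwise,
we have

\[ z \in \mathrm{dom}(\sigma') \land \mathrm{vars}(z\sigma') \subseteq \mathrm{hvars}_{m-1}(\sigma'). \]

Now, if $z \neq y$ we have $z\sigma = z\sigma'$, so that, by Theorem 48 and the inductive
hypothesis we have

\[ z \in \mathrm{dom}(\sigma) \land \mathrm{vars}(z\sigma) \subseteq \mathrm{hvars}_m(\sigma), \]

so that, by Definition 11, $z \in \mathrm{hvars}_{m+1}(\sigma)$. Otherwise, if $z = y$, by definition
of $\sigma'$, the inductive hypothesis and Definition 11, we have

\begin{align*}
\mathrm{vars}(z\sigma') &= \bigl(\mathrm{vars}(s) \setminus \{x\}\bigr) \cup \mathrm{vars}(t) \\
                &\subseteq \mathrm{hvars}_{m-1}(\sigma') \\
                &\subseteq \mathrm{hvars}_m(\sigma) \\
                &\subseteq \mathrm{hvars}_{m+1}(\sigma).
\end{align*}

Also note that we have

\begin{align*}
\mathrm{vars}(x\sigma) &= \mathrm{vars}(t) \\
               &\subseteq \mathrm{hvars}_m(\sigma)
\end{align*}

so that, by Definition 11 we have

\[ x \in \mathrm{hvars}_{m+1}(\sigma). \]

The result follows by observing that

\[ \mathrm{vars}(z\sigma) = \mathrm{vars}(s) = \bigl(\mathrm{vars}(s) \setminus \{x\}\bigr) \cup \{x\}. \]

□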

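Lemma 58 Let $\sigma, \sigma' \in \mathit{RSubst}$, where $\sigma \overset{\mathcal{S}}{\longmapsto}{}^{*} \sigma'$. Then $\mathrm{hvars}(\sigma) = \mathrm{hvars}(\sigma')$.

PROOF. By induction on the length $n \geq 0$ of the derivation. For the base
case, when $n = 0$, there is nothing to prove. Suppose now that

\[ \sigma = \sigma_0 \overset{\mathcal{S}}{\longmapsto} \cdots \overset{\mathcal{S}}{\longmapsto} \sigma_{n-1} \overset{\mathcal{S}}{\longmapsto} \sigma_n = \sigma', \]

where $n > 0$. By the inductive hypothesis, since the derivation $\sigma \overset{\mathcal{S}}{\longmapsto}{}^{*} \sigma_{n-1}$
has length $n - 1$, we have $\mathrm{hvars}(\sigma) = \mathrm{hvars}(\sigma_{n-1})$. Then the thesis follows by
Lemma 57. □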
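
The invariance stated in Lemmas 57 and 58 can be checked experimentally on small substitutions. The Python sketch below is illustrative only and rests on assumptions of ours: variables are strings, compound terms are tuples `(functor, arg1, ..., argn)`, substitutions are dicts, `hvars` is restricted to the variables occurring in the substitution, and all function names are hypothetical.

```python
def term_vars(term):
    """Variables of a finite term (variables are str, compound terms are tuples)."""
    if isinstance(term, str):
        return {term}
    found = set()
    for arg in term[1:]:
        found |= term_vars(arg)
    return found


def apply_subst(term, subst):
    """Apply a substitution once (simultaneously) to a finite term."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(apply_subst(arg, subst) for arg in term[1:])


def s_step(subst, x, y):
    """One S-step: rewrite the binding of y by unfolding x in it."""
    result = dict(subst)
    result[y] = apply_subst(subst[y], {x: subst[x]})
    return result


def hvars(subst):
    """Least fixpoint of Definition 11, restricted to the variables of subst."""
    universe = set(subst)
    for term in subst.values():
        universe |= term_vars(term)
    finite = universe - set(subst)
    while True:
        grown = finite | {y for y in subst if term_vars(subst[y]) <= finite}
        if grown == finite:
            return finite
        finite = grown


# Assumed example with one cyclic binding: W -> h(W, X).
sigma = {"X": ("f", "Y"), "Y": ("g", "Z"), "W": ("h", "W", "X")}
for x in sigma:
    for y in sigma:
        if x != y:
            # Every single S-step should leave the finite variables unchanged.
            print(x, y, hvars(s_step(sigma, x, y)) == hvars(sigma))
```

This is a test on one example, not a proof; it only exercises the statement of Lemma 57 for each admissible pair of bindings.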

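Proof of Proposition 13 on page 18. By Theorem 50, there exists $\sigma' \in \mathit{VSubst}$
such that $\sigma \overset{\mathcal{S}}{\longmapsto}{}^{*} \sigma'$ and, for all equality theories $\mathcal{T}$, $\mathcal{T} \vdash \forall(\sigma \leftrightarrow \sigma')$. By
Lemma 56, for all $x \in \mathit{Vars}$, $\mathrm{rt}(x, \sigma') \in \mathit{HTerms}$ if and only if $x \in \mathrm{hvars}(\sigma')$.
By Lemma 58, we have $\mathrm{hvars}(\sigma) = \mathrm{hvars}(\sigma')$ and, by Proposition 45, for all
$x \in \mathit{Vars}$, $\mathrm{rt}(x, \sigma') \in \mathit{HTerms}$ if and only if $\mathrm{rt}(x, \sigma) \in \mathit{HTerms}$. Therefore, for
any $x \in \mathit{Vars}$, $\mathrm{rt}(x, \sigma) \in \mathit{HTerms}$ if and only if $x \in \mathrm{hvars}(\sigma)$. □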
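
One way to read Proposition 13 operationally: for a substitution in rational solved form, the tree $\mathrm{rt}(x, \sigma)$ is finite exactly when no cycle of bindings is reachable from $x$. The Python sketch below compares such a reachability test with the fixpoint of Definition 11 on an example. It is an illustration under our own assumptions (variables are strings, compound terms are tuples `(functor, arg1, ..., argn)`, substitutions are dicts with no circular subsets of variable-to-variable bindings, and hypothetical function names), not the paper's algorithm.

```python
def term_vars(term):
    """Variables of a finite term (variables are str, compound terms are tuples)."""
    if isinstance(term, str):
        return {term}
    found = set()
    for arg in term[1:]:
        found |= term_vars(arg)
    return found


def rt_is_finite(var, subst, on_path=frozenset()):
    """True if unfolding var through subst can never revisit a variable on the path."""
    if var not in subst:
        return True                      # unbound variable: the tree is just var
    if var in on_path:
        return False                     # a cycle of bindings is reachable
    path = on_path | {var}
    return all(rt_is_finite(v, subst, path) for v in term_vars(subst[var]))


def hvars(subst):
    """Least fixpoint of Definition 11, restricted to the variables of subst."""
    universe = set(subst)
    for term in subst.values():
        universe |= term_vars(term)
    finite = universe - set(subst)
    while True:
        grown = finite | {y for y in subst if term_vars(subst[y]) <= finite}
        if grown == finite:
            return finite
        finite = grown


# Assumed example: W is bound to a cyclic term and V depends on W.
sigma = {"X": ("f", "Y"), "Y": ("g", "Z"),
         "W": ("h", "W", "X"), "V": ("k", "W")}
by_reachability = {v for v in ("X", "Y", "Z", "W", "V") if rt_is_finite(v, sigma)}
print(by_reachability == hvars(sigma))
```

The recursive test is exponential in the worst case; the fixpoint formulation reaches the same set on this example with a number of passes bounded by the size of the domain.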

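Proof of Proposition 15 on page 18. We prove the two statements (15a)
and (15b) separately.

(15a). By hypothesis, $\tau \in {\downarrow}\sigma$. Thus, by Proposition 2, $\mathcal{RT} \vdash \forall(\tau \rightarrow \sigma)$.
Suppose $x \in \mathrm{hvars}(\tau)$. Then, by Proposition 13, we have $\mathrm{rt}(x, \tau) \in \mathit{HTerms}$.
Therefore we can apply Proposition 45 to obtain $\mathrm{rt}(x, \sigma) \in \mathit{HTerms}$ and hence,
by Proposition 13, $x \in \mathrm{hvars}(\sigma)$.

(15b). Suppose $x \in \mathrm{hvars}(\sigma) \cap \mathrm{gvars}(\sigma)$. Then, by Propositions 13 and 52,
$\mathrm{rt}(x, \sigma) \in \mathit{GTerms} \cap \mathit{HTerms}$. Thus, by case (44b) of Proposition 44, there
exists $i \in \mathbb{N}$ such that $\mathrm{rt}(x, \sigma) = x\sigma^{i}$ and also $\mathrm{vars}(x\sigma^{i}) = \emptyset$. Thus
$\mathrm{rt}(x\sigma^{i}, \tau) = x\sigma^{i}$. Since by hypothesis we have $\tau \in {\downarrow}\sigma$, by Lemma 40 and transitivity
we obtain that $\mathcal{RT} \vdash \forall\bigl(\tau \rightarrow (x = x\sigma^{i})\bigr)$. Thus, by Lemma 42,
$\mathrm{rt}(x, \tau) = \mathrm{rt}(x\sigma^{i}, \tau) = x\sigma^{i}$. Therefore, by Propositions 13 and 52,
$x \in \mathrm{gvars}(\tau) \cap \mathrm{hvars}(\tau)$. □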

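Proposition 59 Let $\sigma, \tau \in \mathit{RSubst}$ be satisfiable in a syntactic equality theory
$\mathcal{T}$ and $W \subseteq \mathit{Vars}$, where $\mathcal{T} \vdash \forall(\exists W \mathrel{.} \sigma \leftrightarrow \exists W \mathrel{.} \tau)$. Then

\[ \mathrm{hvars}(\sigma) \setminus W = \mathrm{hvars}(\tau) \setminus W. \]

PROOF. Suppose $z \in \mathrm{hvars}(\sigma) \setminus W$. By Proposition 13, $\mathrm{rt}(z, \sigma) \in \mathit{HTerms}$
and hence, by Proposition 45, $\mathrm{rt}(z, \tau) \in \mathit{HTerms}$. Therefore, by Proposition 13,
$z \in \mathrm{hvars}(\tau)$.

The reverse inclusion follows by symmetry. □

Corollary 60 Let $e \subseteq \mathit{Eqs}$ be satisfiable in the syntactic equality theory $\mathcal{T}$. If
$\sigma, \tau \in \mathrm{mgs}(e)$, then $\mathrm{hvars}(\sigma) = \mathrm{hvars}(\tau)$.

PROOF. By definition of mgs, we have $\sigma, \tau \in \mathit{RSubst}$, $\mathcal{T} \vdash \forall(\sigma \leftrightarrow e)$ and
$\mathcal{T} \vdash \forall(\tau \leftrightarrow e)$ so that $\mathcal{T} \vdash \forall(\sigma \leftrightarrow \tau)$. Thus the result follows by Proposition 59. □

B.5 Abstracting Finiteness

Proof of Theorem 17 on page 19. By Definition 16, $\alpha_H(\sigma) = \mathrm{hvars}(\sigma) \cap \mathit{VI}$
and $\alpha_H(\tau) = \mathrm{hvars}(\tau) \cap \mathit{VI}$. The result is a simple consequence of
Proposition 59, since $\mathcal{RT}$ is a syntactic equality theory, $\sigma, \tau \in \mathit{RSubst}$ are satisfiable
in $\mathcal{RT}$ and, by hypothesis, $\mathcal{RT} \vdash \forall(\sigma \leftrightarrow \tau)$. □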



B.6 Correctness of Abstract Unification on $H \times P$

For the rest of the appendix it is assumed that the equality theory $\mathcal{RT}$ holds.
Note that this means that the congruence and identity axioms hold and also
that every substitution in $\mathit{RSubst}$ is satisfiable in $\mathcal{RT}$.

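Lemma 61 Let $\bar{s} = (s_1, \ldots, s_n) \in \mathit{HTerms}^n$ be linear, and suppose the tuple
of terms $\bar{t} = (t_1, \ldots, t_n) \in \mathit{HTerms}^n$ is such that $\mathrm{vars}(\bar{s}) \cap \mathrm{nlvars}(\bar{t}) = \emptyset$ and
$\mathrm{mgs}(\bar{s} = \bar{t}) \neq \emptyset$. Then there exists $\mu \in \mathrm{mgs}(\bar{s} = \bar{t})$ such that, for each variable
$z \in \mathrm{dom}(\mu) \setminus \bigl(\mathrm{vars}(\bar{s}) \cap \mathrm{vars}(\bar{t})\bigr)$, we have $\mathrm{vars}(z\mu) \cap \mathrm{dom}(\mu) = \emptyset$.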
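
The purely syntactic hypotheses of Lemma 61 can be checked mechanically. The Python sketch below assumes, as the notation suggests but as is not restated in this appendix, that a tuple is linear when no variable occurs in it more than once and that nlvars collects the variables occurring more than once. The term encoding (variables as strings, compound terms as tuples `(functor, arg1, ..., argn)`) and all function names are hypothetical, and the hypothesis $\mathrm{mgs}(\bar{s} = \bar{t}) \neq \emptyset$ is deliberately not checked.

```python
from collections import Counter


def var_occurrences(terms):
    """Count the occurrences of each variable in a tuple of finite terms."""
    counts = Counter()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if isinstance(term, str):
            counts[term] += 1            # a variable occurrence
        else:
            stack.extend(term[1:])       # skip the functor, visit the arguments
    return counts


def is_linear(terms):
    """Assumed meaning of 'linear': every variable occurs at most once."""
    return all(n == 1 for n in var_occurrences(terms).values())


def nlvars(terms):
    """Assumed meaning of nlvars: variables occurring more than once."""
    return {v for v, n in var_occurrences(terms).items() if n > 1}


def syntactic_hypotheses_hold(s_tuple, t_tuple):
    """s linear and vars(s) disjoint from nlvars(t); satisfiability is not checked."""
    s_vars = set(var_occurrences(s_tuple))
    return is_linear(s_tuple) and not (s_vars & nlvars(t_tuple))


# Assumed example: s = (X, f(Y)), t = (g(Z, Z), W).
s_bar = ("X", ("f", "Y"))
t_bar = (("g", "Z", "Z"), "W")
print(syntactic_hypotheses_hold(s_bar, t_bar))
```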




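PROOF. The proof is by induction on the number of variables in
$\mathrm{vars}(\bar{s}) \cup \mathrm{vars}(\bar{t})$.

Suppose first that, for some $i = 1, \ldots, n$, we have $s_i = f(r_1, \ldots, r_m)$ and
$t_i = f(u_1, \ldots, u_m)$ (with $m > 0$). Let

\begin{align*}
\bar{s}' &\stackrel{\mathrm{def}}{=} (s_1, \ldots, s_{i-1}, r_1, \ldots, r_m, s_{i+1}, \ldots, s_n), \\
\bar{t}' &\stackrel{\mathrm{def}}{=} (t_1, \ldots, t_{i-1}, u_1, \ldots, u_m, t_{i+1}, \ldots, t_n).
\end{align*}

Then $\mathrm{mvars}(\bar{s}') = \mathrm{mvars}(\bar{s})$ and $\mathrm{mvars}(\bar{t}') = \mathrm{mvars}(\bar{t})$ so that, as $\bar{s}$ is linear, $\bar{s}'$
is linear, $\mathrm{vars}(\bar{s}') \cap \mathrm{nlvars}(\bar{t}') = \emptyset$ and $\mathrm{vars}(\bar{s}') \cap \mathrm{vars}(\bar{t}') = \mathrm{vars}(\bar{s}) \cap \mathrm{vars}(\bar{t})$.
Moreover, by the congruence axiom (B.4), $\mathrm{mgs}(\bar{s}' = \bar{t}') = \mathrm{mgs}(\bar{s} = \bar{t})$. We
repeat this process until no term in $\bar{s}'$ and $\bar{t}'$ can be decomposed any
further. (Note that in the case that $s_i$ and $t_i$ are identical constants, we can
remove them from $\bar{s}'$ and $\bar{t}'$, since the corresponding equation $s_i = t_i$ holds
vacuously.) Thus, as $\bar{s}$ and $\bar{t}$ are finite sequences of finite terms, we can assume
that, for all $i = 1, \ldots, n$, either $s_i \in \mathit{Vars}$ or $t_i \in \mathit{Vars}$.

Secondly, suppose that for some $i = 1, \ldots, n$, $s_i = t_i$. By the previous
paragraph, we can assume that $s_i \in \mathit{Vars}$. Let

\begin{align*}
\bar{s}_i &\stackrel{\mathrm{def}}{=} (s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_n), \\
\bar{t}_i &\stackrel{\mathrm{def}}{=} (t_1, \ldots, t_{i-1}, t_{i+1}, \ldots, t_n).
\end{align*}

Then $\mathrm{mvars}(\bar{s}_i) \cup \{s_i\} = \mathrm{mvars}(\bar{s})$ and $\mathrm{mvars}(\bar{t}_i) \cup \{s_i\} = \mathrm{mvars}(\bar{t})$ so that, as
$\bar{s}$ is linear, $\bar{s}_i$ is linear, $\mathrm{vars}(\bar{s}_i) \cap \mathrm{nlvars}(\bar{t}_i) = \emptyset$ and

\[ \bigl(\mathrm{vars}(\bar{s}_i) \cap \mathrm{vars}(\bar{t}_i)\bigr) \cup \{s_i\} = \mathrm{vars}(\bar{s}) \cap \mathrm{vars}(\bar{t}). \]

As $\bar{s}$ is linear and $\mathrm{vars}(\bar{s}) \cap \mathrm{nlvars}(\bar{t}) = \emptyset$, $s_i \notin \mathrm{vars}(\bar{s}_i) \cup \mathrm{vars}(\bar{t}_i)$ and hence,
for all $\mu \in \mathrm{mgs}(\bar{s} = \bar{t})$, we have $s_i \notin \mathrm{dom}(\mu)$. Therefore

\[ \mathrm{dom}(\mu) \setminus \bigl(\mathrm{vars}(\bar{s}) \cap \mathrm{vars}(\bar{t})\bigr) = \mathrm{dom}(\mu) \setminus \bigl(\mathrm{vars}(\bar{s}_i) \cap \mathrm{vars}(\bar{t}_i)\bigr). \]

Furthermore, by the congruence axiom (B.1), $\mathrm{mgs}(\bar{s}_i = \bar{t}_i) = \mathrm{mgs}(\bar{s} = \bar{t})$.
Thus, as $\bar{s}$ and $\bar{t}$ are sequences of finite length $n$, we can assume that $s_i \neq t_i$,
for all $i = 1, \ldots, n$.

Therefore, for the rest of the proof, we will assume that for each $i = 1, \ldots, n$,
$s_i \neq t_i$ and either $s_i \in \mathit{Vars}$ or $t_i \in \mathit{Vars}$.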

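For the base case, we have $\mathrm{vars}(\bar{s}) \cup \mathrm{vars}(\bar{t}) = \emptyset$ and the result holds.

For the inductive step, $\mathrm{vars}(\bar{s}) \cup \mathrm{vars}(\bar{t}) \neq \emptyset$ so that $n > 0$. As the order of
the equations in $\bar{s} = \bar{t}$ is not relevant to the hypothesis, we assume, without
loss of generality, that if, for some $i = 1, \ldots, n$, $\mathrm{vars}(s_i) \cap \mathrm{vars}(t_i) = \emptyset$, then
$\mathrm{vars}(s_1) \cap \mathrm{vars}(t_1) = \emptyset$. There are three cases we consider separately:

a. for all $i = 1, \ldots, n$, $\mathrm{vars}(s_i) \cap \mathrm{vars}(t_i) \neq \emptyset$;

b. $s_1 \in \mathit{Vars} \setminus \mathrm{vars}(t_1)$;

c. $t_1 \in \mathit{Vars} \setminus \mathrm{vars}(s_1)$.

Case a. For all $i = 1, \ldots, n$, $\mathrm{vars}(s_i) \cap \mathrm{vars}(t_i) \neq \emptyset$.

For each $i = 1, \ldots, n$, we are assuming that either $s_i \in \mathit{Vars}$ or $t_i \in \mathit{Vars}$.
Therefore, for each $i = 1, \ldots, n$, $s_i \in \mathrm{vars}(t_i)$ or $t_i \in \mathrm{vars}(s_i)$ so that, without
loss of generality, we can assume, for some $k$, where $0 \leq k \leq n$, $s_i \in \mathit{Vars}$ if
$1 \leq i \leq k$ and $t_i \in \mathit{Vars}$ if $k + 1 \leq i \leq n$.

Let

\[ \mu \stackrel{\mathrm{def}}{=} \{s_1 = t_1, \ldots, s_k = t_k\} \cup \{t_{k+1} = s_{k+1}, \ldots, t_n = s_n\}. \]

We now show that $\mu \subseteq \mathit{Eqs}$ is in rational solved form. As $\bar{s}$ is linear, $(s_1, \ldots, s_k)$
is linear. As $\bar{s}$ is linear and $t_i \in \mathrm{vars}(s_i)$ if $k + 1 \leq i \leq n$, then $(t_{k+1}, \ldots, t_n)$
is linear and $\{s_1, \ldots, s_k\} \cap \{t_{k+1}, \ldots, t_n\} = \emptyset$. As we are assuming that, for
all $i = 1, \ldots, n$, $s_i \neq t_i$ and $\mathrm{vars}(s_i) \cap \mathrm{vars}(t_i) \neq \emptyset$, it follows that $t_i \notin \mathit{Vars}$
when $1 \leq i \leq k$ and $s_i \notin \mathit{Vars}$ when $k + 1 \leq i \leq n$, so that each equation in
$\mu$ is a binding and $\mu$ has no circular subsets. Thus $\mu \in \mathit{RSubst}$ and hence, by
the congruence axiom (B.2), $\mu \in \mathrm{mgs}(\bar{s} = \bar{t})$.

As $s_i \in \mathrm{vars}(t_i)$ when $1 \leq i \leq k$ and $t_i \in \mathrm{vars}(s_i)$ when $k + 1 \leq i \leq n$,
$\mathrm{dom}(\mu) \setminus \bigl(\mathrm{vars}(\bar{s}) \cap \mathrm{vars}(\bar{t})\bigr) = \emptyset$. Therefore the required result holds.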

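Case b. $s_1 \in \mathit{Vars} \setminus \mathrm{vars}(t_1)$.

Let

\begin{align*}
\bar{s}_1 &\stackrel{\mathrm{def}}{=} (s_2, \ldots, s_n), \\
\bar{t}_1 &\stackrel{\mathrm{def}}{=} \bigl(t_2\{s_1 \mapsto t_1\}, \ldots, t_n\{s_1 \mapsto t_1\}\bigr). \tag{B.18}
\end{align*}

As $\bar{s}$ is linear, $s_1 \notin \mathrm{vars}(\bar{s}_1)$. Also, all occurrences of $s_1$ in $\bar{t}$ are replaced in $\bar{t}_1$
by $t_1$ so that, as $s_1 \notin \mathrm{vars}(t_1)$, $s_1 \notin \mathrm{vars}(\bar{t}_1)$. Thus

\[ s_1 \notin \mathrm{vars}(\bar{s}_1) \cup \mathrm{vars}(\bar{t}_1). \tag{B.19} \]

Therefore $\mathrm{vars}(\bar{s}_1) \cup \mathrm{vars}(\bar{t}_1) \subset \mathrm{vars}(\bar{s}) \cup \mathrm{vars}(\bar{t})$. Now since $\bar{s}$ is linear, $\bar{s}_1$ is
linear. Thus, to apply the inductive hypothesis to $\bar{s}_1$ and $\bar{t}_1$, we have to show
that

\[ \mathrm{vars}(\bar{s}_1) \cap \mathrm{nlvars}(\bar{t}_1) = \emptyset. \tag{B.20} \]

Suppose that $u \in \mathrm{vars}(\bar{s}_1)$ so that $u \in \mathrm{vars}(\bar{s})$. Now, by hypothesis, we have
$\mathrm{vars}(\bar{s}) \cap \mathrm{nlvars}(\bar{t}) = \emptyset$. Thus $s_1, u \notin \mathrm{nlvars}(\bar{t})$. If $u \in \mathrm{vars}\bigl((t_2, \ldots, t_n)\bigr)$ so that
$u \notin \mathrm{vars}(t_1)$, then $u \notin \mathrm{nlvars}(\bar{t}_1)$. On the other hand, if $u \notin \mathrm{vars}\bigl((t_2, \ldots, t_n)\bigr)$
then, as $s_1 \notin \mathrm{nlvars}\bigl((t_2, \ldots, t_n)\bigr)$ and $u \notin \mathrm{nlvars}(t_1)$, $u \notin \mathrm{nlvars}(\bar{t}_1)$. Thus,
for all $u \in \mathrm{vars}(\bar{s}_1)$, $u \notin \mathrm{nlvars}(\bar{t}_1)$. Hence (B.20) holds. It follows that the
inductive hypothesis for $\bar{s}_1$ and $\bar{t}_1$ holds. Therefore there exists $\mu_1 \in \mathit{RSubst}$
where

\[ \mu_1 \in \mathrm{mgs}(\bar{s}_1 = \bar{t}_1) \]

such that, for each $z \in \mathrm{dom}(\mu_1) \setminus \bigl(\mathrm{vars}(\bar{s}_1) \cap \mathrm{vars}(\bar{t}_1)\bigr)$, $\mathrm{vars}(z\mu_1) \cap \mathrm{dom}(\mu_1) = \emptyset$.
Let

\[ \mu \stackrel{\mathrm{def}}{=} \{s_1 = t_1\mu_1\} \cup \mu_1. \tag{B.21} \]

We now show that $\mu \subseteq \mathit{Eqs}$ is in $\mathrm{mgs}(\bar{s} = \bar{t})$. First we show that $\mu$ is in rational
solved form. By (B.19),

\[ s_1 \notin \mathrm{vars}(\mu_1), \tag{B.22} \]

and, as $s_1 \notin \mathrm{vars}(t_1)$, we have

\[ s_1 \notin \mathrm{vars}(t_1\mu_1). \tag{B.23} \]

Thus, as $\mu_1 \in \mathit{RSubst}$, $\mu$ has no identities or circular subsets so that
$\mu \in \mathit{RSubst}$. By Lemma 41, $\mu \in \mathrm{mgs}(\bar{s} = \bar{t})$.

Let

\[ z \in \mathrm{dom}(\mu) \setminus \bigl(\mathrm{vars}(\bar{s}) \cap \mathrm{vars}(\bar{t})\bigr). \tag{B.24} \]

Then we have to show that

\[ \mathrm{vars}(z\mu) \cap \mathrm{dom}(\mu) = \emptyset. \tag{B.25} \]

It follows from (B.21) and (B.24) that either $z \in \mathrm{dom}(\mu_1)$ so that $z\mu = z\mu_1$
or $z = s_1$ and $z\mu = t_1\mu_1$. We consider these two cases separately.

Suppose first that $z \in \mathrm{dom}(\mu_1)$. By (B.18), we have both $\mathrm{vars}(\bar{s}_1) \subseteq \mathrm{vars}(\bar{s})$
and $\mathrm{vars}(\bar{t}_1) \subseteq \mathrm{vars}(\bar{t})$, so that $\mathrm{vars}(\bar{s}_1) \cap \mathrm{vars}(\bar{t}_1) \subseteq \mathrm{vars}(\bar{s}) \cap \mathrm{vars}(\bar{t})$. Hence
we have $z \in \mathrm{dom}(\mu_1) \setminus \bigl(\mathrm{vars}(\bar{s}_1) \cap \mathrm{vars}(\bar{t}_1)\bigr)$. Thus we obtain, by the inductive
hypothesis, $\mathrm{vars}(z\mu_1) \cap \mathrm{dom}(\mu_1) = \emptyset$. Now, as $z \in \mathrm{dom}(\mu_1)$ and (B.22) holds,
$s_1 \notin \mathrm{vars}(z\mu_1)$. Thus, as $\mathrm{dom}(\mu) = \mathrm{dom}(\mu_1) \cup \{s_1\}$, $\mathrm{vars}(z\mu_1) \cap \mathrm{dom}(\mu) = \emptyset$.
Hence, as $z\mu = z\mu_1$, (B.25) holds.

Secondly suppose that $z = s_1$. Then we have that $s_1 \notin \mathrm{vars}(\bar{s}) \cap \mathrm{vars}(\bar{t})$.
Hence $\bar{t}_1 = (t_2, \ldots, t_n)$. Let $u$ be any variable in $\mathrm{vars}(t_1)$. Then we have that
$u \notin \mathrm{vars}(\bar{s}_1) \cap \mathrm{vars}(\bar{t}_1)$, since $\mathrm{vars}(\bar{s}) \cap \mathrm{nlvars}(\bar{t}) = \emptyset$. If $u \in \mathrm{dom}(\mu_1)$, then we
can apply the inductive hypothesis to obtain $\mathrm{vars}(u\mu_1) \cap \mathrm{dom}(\mu_1) = \emptyset$. On the
other hand, if $u \notin \mathrm{dom}(\mu_1)$, we have $u = u\mu_1$ and $\mathrm{vars}(u\mu_1) \cap \mathrm{dom}(\mu_1) = \emptyset$.
Hence $\mathrm{vars}(t_1\mu_1) \cap \mathrm{dom}(\mu_1) = \emptyset$. Thus, as $\mathrm{dom}(\mu) = \mathrm{dom}(\mu_1) \cup \{s_1\}$, by
(B.23), $\mathrm{vars}(t_1\mu_1) \cap \mathrm{dom}(\mu) = \emptyset$. Therefore, as $z\mu = t_1\mu_1$, (B.25) holds.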

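Case c. $t_1 \in \mathit{Vars} \setminus \mathrm{vars}(s_1)$.

Let

\begin{align*}
\bar{s}_1 &\stackrel{\mathrm{def}}{=} \bigl(s_2\{t_1 \mapsto s_1\}, \ldots, s_n\{t_1 \mapsto s_1\}\bigr), \\
\bar{t}_1 &\stackrel{\mathrm{def}}{=} \bigl(t_2\{t_1 \mapsto s_1\}, \ldots, t_n\{t_1 \mapsto s_1\}\bigr). \tag{B.26}
\end{align*}

All occurrences of $t_1$ in $\bar{s}$ and $\bar{t}$ are replaced in $\bar{s}_1$ and $\bar{t}_1$ by $s_1$ so that, since
$t_1 \notin \mathrm{vars}(s_1)$,

\[ t_1 \notin \mathrm{vars}(\bar{s}_1) \cup \mathrm{vars}(\bar{t}_1). \tag{B.27} \]

Therefore $\mathrm{vars}(\bar{s}_1) \cup \mathrm{vars}(\bar{t}_1) \subset \mathrm{vars}(\bar{s}) \cup \mathrm{vars}(\bar{t})$. Now, $\bar{s}_1$ is linear since $\bar{s}$ is
linear. Thus, to apply the inductive hypothesis to $\bar{s}_1$ and $\bar{t}_1$, we have to show
that

\[ \mathrm{vars}(\bar{s}_1) \cap \mathrm{nlvars}(\bar{t}_1) = \emptyset. \tag{B.28} \]

Suppose $u$ is any variable in $\mathrm{vars}(\bar{s}_1)$. Then either $u \in \mathrm{vars}\bigl((s_2, \ldots, s_n)\bigr)$ or
we have $u \in \mathrm{vars}(s_1)$ and $t_1 \in \mathrm{vars}\bigl((s_2, \ldots, s_n)\bigr)$. By hypothesis,
$\mathrm{vars}(\bar{s}) \cap \mathrm{nlvars}(\bar{t}) = \emptyset$, so that $u \notin \mathrm{nlvars}(\bar{t})$. If $u \in \mathrm{vars}\bigl((s_2, \ldots, s_n)\bigr)$, then, as $\bar{s}$
is linear, $u \notin \mathrm{vars}(s_1)$. Thus, it follows from (B.26) that $u \notin \mathrm{nlvars}(\bar{t}_1)$. If
$t_1 \in \mathrm{vars}\bigl((s_2, \ldots, s_n)\bigr)$, then we have $t_1 \notin \mathrm{vars}\bigl((t_2, \ldots, t_n)\bigr)$ so that, again
by (B.26), $\bar{t}_1 = (t_2, \ldots, t_n)$. Thus, for all $u \in \mathrm{vars}(\bar{s}_1)$, $u \notin \mathrm{nlvars}(\bar{t}_1)$. Hence
(B.28) holds. It follows that the inductive hypothesis for $\bar{s}_1$ and $\bar{t}_1$ holds.
Therefore there exists $\mu_1 \in \mathit{RSubst}$ where

\[ \mu_1 \in \mathrm{mgs}(\bar{s}_1 = \bar{t}_1) \]

such that, for each $z \in \mathrm{dom}(\mu_1) \setminus \bigl(\mathrm{vars}(\bar{s}_1) \cap \mathrm{vars}(\bar{t}_1)\bigr)$, we have
$\mathrm{vars}(z\mu_1) \cap \mathrm{dom}(\mu_1) = \emptyset$.

Let

\[ \mu \stackrel{\mathrm{def}}{=} \{t_1 = s_1\mu_1\} \cup \mu_1. \tag{B.29} \]

We now show that $\mu \subseteq \mathit{Eqs}$ is in $\mathrm{mgs}(\bar{s} = \bar{t})$. First we show that $\mu$ is in rational
solved form. By (B.27),

\[ t_1 \notin \mathrm{vars}(\mu_1), \tag{B.30} \]

and, as $t_1 \notin \mathrm{vars}(s_1)$, we have

\[ t_1 \notin \mathrm{vars}(s_1\mu_1). \tag{B.31} \]

Thus, as $\mu_1 \in \mathit{RSubst}$, $\mu$ has no identities or circular subsets so that
$\mu \in \mathit{RSubst}$. By Lemma 41, $\mu \in \mathrm{mgs}(\bar{s} = \bar{t})$.

Let

\[ z \in \mathrm{dom}(\mu) \setminus \bigl(\mathrm{vars}(\bar{s}) \cap \mathrm{vars}(\bar{t})\bigr). \tag{B.32} \]

Then we have to show that

\[ \mathrm{vars}(z\mu) \cap \mathrm{dom}(\mu) = \emptyset. \tag{B.33} \]

It follows from (B.29) and (B.32) that either $z \in \mathrm{dom}(\mu_1)$ so that $z\mu = z\mu_1$
or $z = t_1$ and $z\mu = s_1\mu_1$. We consider these two cases separately.

Suppose first that $z \in \mathrm{dom}(\mu_1)$. To apply the inductive hypothesis to $z$, we
need to show that

\[ \mathrm{vars}(\bar{s}_1) \cap \mathrm{vars}(\bar{t}_1) \subseteq \mathrm{vars}(\bar{s}) \cap \mathrm{vars}(\bar{t}). \]

To see this, let us suppose $u \in \mathrm{vars}(\bar{s}_1) \cap \mathrm{vars}(\bar{t}_1)$. Then, by (B.26), either
we have $u \in \mathrm{vars}\bigl((s_2, \ldots, s_n)\bigr)$ or $u \in \mathrm{vars}(s_1)$ and $t_1 \in \mathrm{vars}\bigl((s_2, \ldots, s_n)\bigr)$. If
$u \in \mathrm{vars}\bigl((s_2, \ldots, s_n)\bigr)$, then $u \in \mathrm{vars}(\bar{s})$ so that, as $\bar{s}$ is linear, we have also
$u \notin \mathrm{vars}(s_1)$ and hence $u \in \mathrm{vars}\bigl((t_2, \ldots, t_n)\bigr)$. Alternatively, if $u \in \mathrm{vars}(s_1)$
and $t_1 \in \mathrm{vars}\bigl((s_2, \ldots, s_n)\bigr)$, then $u, t_1 \in \mathrm{vars}(\bar{s})$. Moreover, by hypothesis,
$\mathrm{vars}(\bar{s}) \cap \mathrm{nlvars}(\bar{t}) = \emptyset$, so that $t_1 \notin \mathrm{vars}\bigl((t_2, \ldots, t_n)\bigr)$. Thus $\bar{t}_1 = (t_2, \ldots, t_n)$
and hence $u \in \mathrm{vars}(\bar{t})$. Therefore, in both cases, $u \in \mathrm{vars}(\bar{s}) \cap \mathrm{vars}(\bar{t})$. It follows
that $z \in \mathrm{dom}(\mu_1) \setminus \bigl(\mathrm{vars}(\bar{s}_1) \cap \mathrm{vars}(\bar{t}_1)\bigr)$. Thus, by the inductive hypothesis,
we have $\mathrm{vars}(z\mu_1) \cap \mathrm{dom}(\mu_1) = \emptyset$. Now, as $z \in \mathrm{dom}(\mu_1)$ and (B.30) holds,
$t_1 \notin \mathrm{vars}(z\mu_1)$. Thus, as $\mathrm{dom}(\mu) = \mathrm{dom}(\mu_1) \cup \{t_1\}$, $\mathrm{vars}(z\mu_1) \cap \mathrm{dom}(\mu) = \emptyset$.
Hence, as $z\mu = z\mu_1$, (B.33) holds.

Secondly, suppose that $z = t_1$. Then $t_1 \notin \mathrm{vars}(\bar{s}) \cap \mathrm{vars}(\bar{t})$ and, consequently,
$\bar{s}_1 = (s_2, \ldots, s_n)$. Let $u$ be any variable in $\mathrm{vars}(s_1)$. Then, as $\bar{s}$ is linear, we
have $u \notin \mathrm{vars}(\bar{s}_1)$ so that $u \notin \mathrm{vars}(\bar{s}_1) \cap \mathrm{vars}(\bar{t}_1)$. Thus, if $u \in \mathrm{dom}(\mu_1)$, we can
apply the inductive hypothesis to $u$ and obtain $\mathrm{vars}(u\mu_1) \cap \mathrm{dom}(\mu_1) = \emptyset$. On
the other hand, if $u \notin \mathrm{dom}(\mu_1)$, $u = u\mu_1$ and $\mathrm{vars}(u\mu_1) \cap \mathrm{dom}(\mu_1) = \emptyset$. Hence
$\mathrm{vars}(s_1\mu_1) \cap \mathrm{dom}(\mu_1) = \emptyset$. Thus, as $\mathrm{dom}(\mu) = \mathrm{dom}(\mu_1) \cup \{t_1\}$, by (B.31),
$\mathrm{vars}(s_1\mu_1) \cap \mathrm{dom}(\mu) = \emptyset$. Therefore, as $z\mu = s_1\mu_1$, (B.33) holds. □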

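Lemma 62 Suppose that the tuple of terms $\bar{s} \stackrel{\mathrm{def}}{=} (s_1, \ldots, s_n) \in \mathit{HTerms}^n$ is
linear, $\bar{t} \stackrel{\mathrm{def}}{=} (t_1, \ldots, t_n) \in \mathit{HTerms}^n$ and $\mathrm{mgs}(\bar{s} = \bar{t}) \neq \emptyset$. Then there exists
$\mu \in \mathrm{mgs}(\bar{s} = \bar{t})$ and, for each $z \in \mathrm{dom}(\mu) \setminus \mathrm{vars}(\bar{s})$, the following properties
hold:

(1) $\mathrm{vars}(z\mu) \subseteq \mathrm{vars}(\bar{s})$;

(2) $\mathrm{vars}(z\mu) \cap \mathrm{dom}(\mu) = \emptyset$.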

PROOF. The proof is by induction on the number of variables in
$\mathrm{vars}(\bar{s}) \cup \mathrm{vars}(\bar{t})$.

Suppose first that, for some $i = 1, \ldots, n$, we have $s_i = f(r_1, \ldots, r_m)$ and
$t_i = f(u_1, \ldots, u_m)$ ($m > 0$). Let

\begin{align*}
\bar{s}' &\stackrel{\mathrm{def}}{=} (s_1, \ldots, s_{i-1}, r_1, \ldots, r_m, s_{i+1}, \ldots, s_n), \\
\bar{t}' &\stackrel{\mathrm{def}}{=} (t_1, \ldots, t_{i-1}, u_1, \ldots, u_m, t_{i+1}, \ldots, t_n).
\end{align*}

Then $\mathrm{mvars}(\bar{s}') = \mathrm{mvars}(\bar{s})$ and $\mathrm{mvars}(\bar{t}') = \mathrm{mvars}(\bar{t})$ so that, as $\bar{s}$ is linear, $\bar{s}'$
is linear. Moreover, by the congruence axiom (B.4), $\mathrm{mgs}(\bar{s}' = \bar{t}') = \mathrm{mgs}(\bar{s} = \bar{t})$.
We repeat this process until no term in $\bar{s}'$ and $\bar{t}'$ can be decomposed any
further. (Note that in the case that $s_i$ and $t_i$ are identical constants, we can
remove them from $\bar{s}'$ and $\bar{t}'$, since the corresponding equation $s_i = t_i$ holds
vacuously.) Thus, as $\bar{s}$ and $\bar{t}$ are finite sequences of finite terms, we can assume
that, for all $i = 1, \ldots, n$, either $s_i \in \mathit{Vars}$ or $t_i \in \mathit{Vars}$.

Secondly, suppose that for some $i = 1, \ldots, n$, $s_i = t_i$. By the previous
paragraph, we can assume that $s_i \in \mathit{Vars}$. Let

\begin{align*}
\bar{s}_i &\stackrel{\mathrm{def}}{=} (s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_n), \\
\bar{t}_i &\stackrel{\mathrm{def}}{=} (t_1, \ldots, t_{i-1}, t_{i+1}, \ldots, t_n).
\end{align*}

Then $\mathrm{mvars}(\bar{s}_i) \cup \{s_i\} = \mathrm{mvars}(\bar{s})$ and $\mathrm{mvars}(\bar{t}_i) \cup \{s_i\} = \mathrm{mvars}(\bar{t})$ so that, as
$\bar{s}$ is linear, $\bar{s}_i$ is linear. Therefore

\[ \mathrm{dom}(\mu) \setminus \mathrm{vars}(\bar{s}) \subseteq \mathrm{dom}(\mu) \setminus \mathrm{vars}(\bar{s}_i). \]

Furthermore, by the congruence axiom (B.1), $\mathrm{mgs}(\bar{s}_i = \bar{t}_i) = \mathrm{mgs}(\bar{s} = \bar{t})$.
Thus, as $\bar{s}$ and $\bar{t}$ are sequences of finite length $n$, we can assume that $s_i \neq t_i$,
for all $i = 1, \ldots, n$.

Therefore, for the rest of the proof, we will assume that $s_i \neq t_i$ and either
$s_i \in \mathit{Vars}$ or $t_i \in \mathit{Vars}$, for all $i = 1, \ldots, n$.

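For the base case, we have $\mathrm{vars}(\bar{s}) \cup \mathrm{vars}(\bar{t}) = \emptyset$ and the result holds.

For the inductive step, $\mathrm{vars}(\bar{s}) \cup \mathrm{vars}(\bar{t}) \neq \emptyset$ so that $n > 0$. As the order of
the equations in $\bar{s} = \bar{t}$ is not relevant to the hypothesis, we assume, without
loss of generality, that if, for some $i = 1, \ldots, n$, $\mathrm{vars}(s_i) \cap \mathrm{vars}(t_i) = \emptyset$ then
we have $\mathrm{vars}(s_1) \cap \mathrm{vars}(t_1) = \emptyset$. There are four cases we consider separately:

a. for all $i = 1, \ldots, n$, $\mathrm{vars}(s_i) \cap \mathrm{vars}(t_i) \neq \emptyset$;

b. $s_1 \in \mathit{Vars} \setminus \mathrm{vars}(t_1)$;

c. $t_1 \in \mathit{Vars} \setminus \mathrm{vars}(\bar{s})$ and $s_1 \notin \mathit{Vars}$;

d. $t_1 \in \mathrm{vars}(\bar{s}) \setminus \mathrm{vars}(s_1)$ and $s_1 \notin \mathit{Vars}$.

Case a. For all $i = 1, \ldots, n$, $\mathrm{vars}(s_i) \cap \mathrm{vars}(t_i) \neq \emptyset$.

For each $i = 1, \ldots, n$, we are assuming that either $s_i \in \mathit{Vars}$ or $t_i \in \mathit{Vars}$.
Therefore, for each $i = 1, \ldots, n$, $s_i \in \mathrm{vars}(t_i)$ or $t_i \in \mathrm{vars}(s_i)$ so that, without
loss of generality, we can assume, for some $k$, where $0 \leq k \leq n$, $s_i \in \mathit{Vars}$ if
$1 \leq i \leq k$ and $t_i \in \mathit{Vars}$ if $k + 1 \leq i \leq n$.

Let

\[ \mu \stackrel{\mathrm{def}}{=} \{s_1 = t_1, \ldots, s_k = t_k\} \cup \{t_{k+1} = s_{k+1}, \ldots, t_n = s_n\}. \]

We show that $\mu \subseteq \mathit{Eqs}$ is in $\mathrm{mgs}(\bar{s} = \bar{t})$. First we must show that $\mu \in \mathit{RSubst}$.
As $\bar{s}$ is linear, $(s_1, \ldots, s_k)$ is linear. As $\bar{s}$ is linear and $t_i \in \mathrm{vars}(s_i)$ if
$k + 1 \leq i \leq n$, then $(t_{k+1}, \ldots, t_n)$ is linear and $\{s_1, \ldots, s_k\} \cap \{t_{k+1}, \ldots, t_n\} = \emptyset$. As
we are assuming that, for all $i = 1, \ldots, n$, $s_i \neq t_i$ and $\mathrm{vars}(s_i) \cap \mathrm{vars}(t_i) \neq \emptyset$,
it follows that $t_i \notin \mathit{Vars}$ when $1 \leq i \leq k$ and $s_i \notin \mathit{Vars}$ when $k + 1 \leq i \leq n$,
so that each equation in $\mu$ is a binding and $\mu$ has no circular subsets. Thus
$\mu \in \mathit{RSubst}$ and hence, by the congruence axiom (B.2), $\mu \in \mathrm{mgs}(\bar{s} = \bar{t})$.

As $\{t_{k+1}, \ldots, t_n\} \subseteq \mathrm{vars}\bigl((s_{k+1}, \ldots, s_n)\bigr)$, we have $\mathrm{dom}(\mu) \setminus \mathrm{vars}(\bar{s}) = \emptyset$.
Therefore the required result holds.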

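Case b. $s_1 \in \mathit{Vars} \setminus \mathrm{vars}(t_1)$.

Let

\begin{align*}
\bar{s}_1 &\stackrel{\mathrm{def}}{=} (s_2, \ldots, s_n), \\
\bar{t}_1 &\stackrel{\mathrm{def}}{=} \bigl(t_2\{s_1 \mapsto t_1\}, \ldots, t_n\{s_1 \mapsto t_1\}\bigr).
\end{align*}

As $\bar{s}$ is linear, $\bar{s}_1$ is linear and $s_1 \notin \mathrm{vars}(\bar{s}_1)$. Also, all occurrences of $s_1$ in $\bar{t}$ are
replaced in $\bar{t}_1$ by $t_1$ so that, as $s_1 \notin \mathrm{vars}(t_1)$ (by the assumption for this case),
$s_1 \notin \mathrm{vars}(\bar{t}_1)$. Thus

\[ s_1 \notin \mathrm{vars}(\bar{s}_1) \cup \mathrm{vars}(\bar{t}_1). \tag{B.34} \]

It follows that $\mathrm{vars}(\bar{s}_1) \cup \mathrm{vars}(\bar{t}_1) \subset \mathrm{vars}(\bar{s}) \cup \mathrm{vars}(\bar{t})$ so that the inductive
hypothesis applies to $\bar{s}_1$ and $\bar{t}_1$. Thus there exists $\mu_1 \in \mathit{RSubst}$ where

\[ \mu_1 \in \mathrm{mgs}(\bar{s}_1 = \bar{t}_1) \]

such that, for each $z \in \mathrm{dom}(\mu_1) \setminus \mathrm{vars}(\bar{s}_1)$, properties 1 and 2 hold using $\mu_1$
and $\bar{s}_1$.

Let

\[ \mu \stackrel{\mathrm{def}}{=} \{s_1 = t_1\mu_1\} \cup \mu_1. \]

We show that $\mu \subseteq \mathit{Eqs}$ is in $\mathrm{mgs}(\bar{s} = \bar{t})$. By (B.34), we have $s_1 \notin \mathrm{vars}(\mu_1)$ so
that $s_1 \notin \mathrm{dom}(\mu_1)$. Also, since $\mu_1 \in \mathit{RSubst}$, $\mu$ has no identities or circular
subsets. Thus we have $\mu \in \mathit{RSubst}$. By Lemma 41, $\mu \in \mathrm{mgs}(\bar{s} = \bar{t})$.

Suppose that $z \in \mathrm{dom}(\mu) \setminus \mathrm{vars}(\bar{s})$. As

\[ \mathrm{vars}(\bar{s}_1) \cup \{s_1\} = \mathrm{vars}(\bar{s}) \]

and

\[ \mathrm{dom}(\mu_1) \cup \{s_1\} = \mathrm{dom}(\mu), \]

we have

\[ \mathrm{dom}(\mu_1) \setminus \mathrm{vars}(\bar{s}_1) = \mathrm{dom}(\mu) \setminus \mathrm{vars}(\bar{s}). \tag{B.35} \]

Therefore $z \in \mathrm{dom}(\mu_1) \setminus \mathrm{vars}(\bar{s}_1)$ and $z\mu_1 = z\mu$. Thus the inductive properties 1
and 2 using $\mu_1$ and $\bar{s}_1$ can be applied to $z$. We show that properties 1 and 2
using $\mu$ and $\bar{s}$ can be applied to $z$.

(1) By property 1, $\mathrm{vars}(z\mu) \subseteq \mathrm{vars}(\bar{s}_1)$ and hence, $\mathrm{vars}(z\mu) \subseteq \mathrm{vars}(\bar{s})$.

(2) By property 2, we have $\mathrm{vars}(z\mu) \cap \mathrm{dom}(\mu_1) = \emptyset$. Now $s_1 \notin \mathrm{vars}(z\mu)$
    because $s_1 \notin \mathrm{vars}(\bar{s}_1)$ (since $\bar{s}$ is linear) and $\mathrm{vars}(z\mu) \subseteq \mathrm{vars}(\bar{s}_1)$ (by
    property 1). Thus, as $\mathrm{dom}(\mu) = \mathrm{dom}(\mu_1) \cup \{s_1\}$, we have
    $\mathrm{vars}(z\mu) \cap \mathrm{dom}(\mu) = \emptyset$.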

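Case c. Assume that $t_1 \in \mathit{Vars} \setminus \mathrm{vars}(\bar{s})$ and $s_1 \notin \mathit{Vars}$.

Let

\begin{align*}
\bar{s}_1 &\stackrel{\mathrm{def}}{=} (s_2, \ldots, s_n), \\
\bar{t}_1 &\stackrel{\mathrm{def}}{=} \bigl(t_2\{t_1 \mapsto s_1\}, \ldots, t_n\{t_1 \mapsto s_1\}\bigr).
\end{align*}

As $\bar{s}$ is linear, $\bar{s}_1$ is linear. By the assumption for this case, $t_1 \notin \mathrm{vars}(\bar{s}_1)$. Also,
all occurrences of $t_1$ in $\bar{t}$ are replaced in $\bar{t}_1$ by $s_1$ so that $t_1 \notin \mathrm{vars}(\bar{t}_1)$. Thus

\[ t_1 \notin \mathrm{vars}(\bar{s}_1) \cup \mathrm{vars}(\bar{t}_1). \tag{B.36} \]

It follows that $\mathrm{vars}(\bar{s}_1) \cup \mathrm{vars}(\bar{t}_1) \subset \mathrm{vars}(\bar{s}) \cup \mathrm{vars}(\bar{t})$ so that we can apply the
inductive hypothesis to $\bar{s}_1$ and $\bar{t}_1$. Thus there exists $\mu_1 \in \mathit{RSubst}$ where

\[ \mu_1 \in \mathrm{mgs}(\bar{s}_1 = \bar{t}_1) \]

such that, for each $z \in \mathrm{dom}(\mu_1) \setminus \mathrm{vars}(\bar{s}_1)$, properties 1 and 2 hold using $\mu_1$
and $\bar{s}_1$. Note that, by (B.36), $t_1 \notin \mathrm{vars}(\mu_1)$ and, in particular, $t_1 \notin \mathrm{dom}(\mu_1)$.

Let

\[ \mu \stackrel{\mathrm{def}}{=} \{t_1 = s_1\mu_1\} \cup \mu_1. \tag{B.37} \]

As $s_1 \notin \mathit{Vars}$ and $\mu_1 \in \mathit{RSubst}$, $\mu \subseteq \mathit{Eqs}$ has no identities or circular subsets
so that $\mu \in \mathit{RSubst}$. By Lemma 41, $\mu \in \mathrm{mgs}(\bar{s} = \bar{t})$.

As $t_1 \in \mathrm{dom}(\mu)$ (by (B.37)) and $t_1 \notin \mathrm{vars}(\bar{s})$ (by the assumption for this case),
we have

\[ \bigl(\mathrm{dom}(\mu_1) \setminus \mathrm{vars}(\bar{s}_1)\bigr) \cup \{t_1\} = \mathrm{dom}(\mu) \setminus \mathrm{vars}(\bar{s}). \]

Suppose that $z \in \mathrm{dom}(\mu) \setminus \mathrm{vars}(\bar{s})$. Then either $z \neq t_1$ so that $z\mu = z\mu_1$ and
the inductive properties 1 and 2 using $\mu_1$ and $\bar{s}_1$ can be applied to $z$, or $z = t_1$
and $z\mu = s_1\mu_1$. We show that properties 1 and 2 using $\mu$ and $\bar{s}$ can be applied
to $z$.

(1) Suppose $z \neq t_1$ so that $z\mu = z\mu_1$. Using property 1, $\mathrm{vars}(z\mu_1) \subseteq \mathrm{vars}(\bar{s}_1)$.
    As $\mathrm{vars}(\bar{s}_1) \subseteq \mathrm{vars}(\bar{s})$, it follows that $\mathrm{vars}(z\mu) \subseteq \mathrm{vars}(\bar{s})$.

    Suppose that $z = t_1$ so that $z\mu = s_1\mu_1$. Let $u$ be any variable in $s_1$. As
    $\bar{s}$ is linear, $u \notin \mathrm{vars}(\bar{s}_1)$. Thus, if $u \in \mathrm{dom}(\mu_1)$, we can use property 1 to
    derive that $\mathrm{vars}(u\mu_1) \subseteq \mathrm{vars}(\bar{s}_1)$. If $u \notin \mathrm{dom}(\mu_1)$, then $u\mu_1 = u$ so that
    $\mathrm{vars}(u\mu_1) \subseteq \mathrm{vars}(s_1)$. Moreover $\mathrm{vars}(s_1) \cup \mathrm{vars}(\bar{s}_1) = \mathrm{vars}(\bar{s})$ so that

    \[ \mathrm{vars}(s_1\mu_1) \subseteq \mathrm{vars}(\bar{s}). \tag{B.38} \]

    Hence $\mathrm{vars}(z\mu) \subseteq \mathrm{vars}(\bar{s})$.

(2) Suppose $z \neq t_1$ so that $z\mu = z\mu_1$. Then, as property 2 holds, we have
    $\mathrm{vars}(z\mu) \cap \mathrm{dom}(\mu_1) = \emptyset$. Now $t_1 \notin \mathrm{vars}(z\mu)$ because $\mathrm{vars}(z\mu) \subseteq \mathrm{vars}(\bar{s}_1)$
    (by property 1) and $t_1 \notin \mathrm{vars}(\bar{s}_1)$ (by (B.36)). Thus, as
    $\mathrm{dom}(\mu) = \mathrm{dom}(\mu_1) \cup \{t_1\}$, we have $\mathrm{vars}(z\mu) \cap \mathrm{dom}(\mu) = \emptyset$.

    Suppose that $z = t_1$ so that $z\mu = s_1\mu_1$. Let $u$ be any variable in
    $\mathrm{vars}(s_1)$. Then, as $\bar{s}$ is linear, $u \notin \mathrm{vars}(\bar{s}_1)$. Then either $u \in \mathrm{dom}(\mu_1)$,
    and we can apply property 2 to $u$ to obtain $\mathrm{vars}(u\mu_1) \cap \mathrm{dom}(\mu_1) = \emptyset$,
    or $u = u\mu_1$, and $\mathrm{vars}(u\mu_1) \cap \mathrm{dom}(\mu_1) = \emptyset$. Hence we have
    $\mathrm{vars}(s_1\mu_1) \cap \mathrm{dom}(\mu_1) = \emptyset$. Now $t_1 \notin \mathrm{vars}(s_1\mu_1)$ because $\mathrm{vars}(s_1\mu_1) \subseteq \mathrm{vars}(\bar{s})$ (by
    (B.38)) and $t_1 \notin \mathrm{vars}(\bar{s})$ (by the assumption for this case). Thus, as
    $\mathrm{dom}(\mu) = \mathrm{dom}(\mu_1) \cup \{t_1\}$, we have $\mathrm{vars}(z\mu) \cap \mathrm{dom}(\mu) = \emptyset$.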

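Case d. Assume that $t_1 \in \mathrm{vars}(\bar{s}) \setminus \mathrm{vars}(s_1)$ and $s_1 \notin \mathit{Vars}$.

Let

\begin{align*}
\bar{s}_1 &\stackrel{\mathrm{def}}{=} \bigl(s_2\{t_1 \mapsto s_1\}, \ldots, s_n\{t_1 \mapsto s_1\}\bigr), \\
\bar{t}_1 &\stackrel{\mathrm{def}}{=} \bigl(t_2\{t_1 \mapsto s_1\}, \ldots, t_n\{t_1 \mapsto s_1\}\bigr).
\end{align*}

As $\bar{s}$ is linear, there is only one occurrence of $t_1$ in $\{s_2, \ldots, s_n\}$, and, in $\bar{s}_1$, this
is replaced by $s_1$ which is also linear. Thus $\bar{s}_1$ is linear, $\mathrm{vars}(\bar{s}_1) \subseteq \mathrm{vars}(\bar{s})$ and
$t_1 \notin \mathrm{vars}(\bar{s}_1)$. Also, all occurrences of $t_1$ in $\bar{t}$ are replaced in $\bar{t}_1$ by $s_1$ so that
$t_1 \notin \mathrm{vars}(\bar{t}_1)$. Thus

\[ t_1 \notin \mathrm{vars}(\bar{s}_1) \cup \mathrm{vars}(\bar{t}_1). \tag{B.39} \]

It follows that $\mathrm{vars}(\bar{s}_1) \cup \mathrm{vars}(\bar{t}_1) \subset \mathrm{vars}(\bar{s}) \cup \mathrm{vars}(\bar{t})$ so that we can apply the
inductive hypothesis to $\bar{s}_1$ and $\bar{t}_1$. Thus, there exists $\mu_1 \in \mathit{RSubst}$ where

\[ \mu_1 \in \mathrm{mgs}(\bar{s}_1 = \bar{t}_1) \]

such that, for each $z \in \mathrm{dom}(\mu_1) \setminus \mathrm{vars}(\bar{s}_1)$, properties 1 and 2 hold using $\mu_1$
and $\bar{s}_1$.

Let

\[ \mu \stackrel{\mathrm{def}}{=} \{t_1 = s_1\mu_1\} \cup \mu_1. \]

By (B.39), $t_1 \notin \mathrm{vars}(\mu_1)$. Moreover $\mu_1 \in \mathit{RSubst}$ and $s_1 \notin \mathit{Vars}$ so that
$\mu \subseteq \mathit{Eqs}$ has no identities or circular subsets. Thus $\mu \in \mathit{RSubst}$. By Lemma 41,
$\mu \in \mathrm{mgs}(\bar{s} = \bar{t})$.

As $\mathrm{vars}(\bar{s}_1) \cup \{t_1\} = \mathrm{vars}(\bar{s})$ and $\mathrm{dom}(\mu_1) \cup \{t_1\} = \mathrm{dom}(\mu)$, we have

\[ \mathrm{dom}(\mu_1) \setminus \mathrm{vars}(\bar{s}_1) = \mathrm{dom}(\mu) \setminus \mathrm{vars}(\bar{s}). \]

Suppose $z \in \mathrm{dom}(\mu) \setminus \mathrm{vars}(\bar{s})$. Then $z \neq t_1$, $z\mu = z\mu_1$ and the inductive
properties 1 and 2 using $\mu_1$ and $\bar{s}_1$ can be applied to $z$. We show that
properties 1 and 2 using $\mu$ and $\bar{s}$ can be applied to $z$.

(1) By property 1, $\mathrm{vars}(z\mu) \subseteq \mathrm{vars}(\bar{s}_1)$ and hence, as $\mathrm{vars}(\bar{s}_1) \subseteq \mathrm{vars}(\bar{s})$,
    $\mathrm{vars}(z\mu) \subseteq \mathrm{vars}(\bar{s})$.

(2) By property 2, we have $\mathrm{vars}(z\mu) \cap \mathrm{dom}(\mu_1) = \emptyset$. Now $t_1 \notin \mathrm{vars}(z\mu)$
    because $t_1 \notin \mathrm{vars}(\bar{s}_1)$ (by (B.39)) and $\mathrm{vars}(z\mu) \subseteq \mathrm{vars}(\bar{s}_1)$ (by property 1).
    It follows that $\mathrm{vars}(z\mu) \cap \mathrm{dom}(\mu) = \emptyset$, since $\mathrm{dom}(\mu_1) \cup \{t_1\} = \mathrm{dom}(\mu)$.

□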



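Proposition 63 Let $p \in P$ and $(x \mapsto t) \in \mathit{Bind}$, where $\{x\} \cup \mathrm{vars}(t) \subseteq \mathit{VI}$. Let
also $\sigma \in \gamma_P(p) \cap \mathit{VSubst}$ and suppose that $\{r, r'\} = \{x, t\}$, $\mathrm{vars}(r) \subseteq \mathrm{hvars}(\sigma)$
and $\mathrm{rt}(r, \sigma) \in \mathit{GTerms}$. Then, for all $\tau \in \mathrm{mgs}\bigl(\sigma \cup \{x = t\}\bigr)$, we have

\[ \mathrm{hvars}(\sigma) \cup \mathrm{vars}(r') \subseteq \mathrm{hvars}(\tau). \tag{B.40} \]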
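PROOF. If $\sigma \cup \{x = t\}$ is not satisfiable, the result is trivial. We therefore
assume, for the rest of the proof, that $\sigma \cup \{x = t\}$ is satisfiable in $\mathcal{RT}$. It
follows from Corollary 60 that we just have to show that

(1) $\mathrm{vars}(r') \subseteq \mathrm{hvars}(\tau)$, for some $\tau \in \mathrm{mgs}\bigl(\sigma \cup \{x = t\}\bigr)$;

(2) $\mathrm{hvars}(\sigma) \subseteq \mathrm{hvars}(\tau)$, for some $\tau \in \mathrm{mgs}\bigl(\sigma \cup \{x = t\}\bigr)$.

From these, we can then conclude that, for all $\tau \in \mathrm{mgs}\bigl(\sigma \cup \{x = t\}\bigr)$, (B.40)
holds.

Note that, in both cases, since $\sigma \in \mathit{VSubst}$ and $\mathrm{vars}(r) \subseteq \mathrm{hvars}(\sigma)$, by
Proposition 55 we have $\mathrm{rt}(r, \sigma) = r\sigma$, so that $r\sigma \in \mathit{HTerms} \cap \mathit{GTerms}$.

We first prove statement 1. We must show that there exists
$\tau \in \mathrm{mgs}\bigl(\sigma \cup \{x = t\}\bigr)$ such that $\mathrm{vars}(r') \subseteq \mathrm{hvars}(\tau)$.

As $\mathrm{mgs}\bigl(\sigma \cup \{x = t\}\bigr) \neq \emptyset$, by Theorem 50 and the definition of mgs we can
assume that there exists $\tau \in \mathit{VSubst} \cap \mathrm{mgs}\bigl(\sigma \cup \{x = t\}\bigr)$. Thus

\[ \tau \implies \bigl(\sigma \cup \{r = r'\}\bigr). \]

By Lemma 40 and the congruence axioms, we have $\tau \implies \{r\sigma = r'\}$. Since
$\tau \in \mathit{VSubst}$ and $r\sigma \in \mathit{HTerms} \cap \mathit{GTerms}$, Lemma 51 applies (with $s = r\sigma$)
so that $r\sigma = r'\tau \in \mathit{HTerms} \cap \mathit{GTerms}$. Thus, by Proposition 54, $\mathrm{vars}(r') \subseteq \mathrm{hvars}(\tau)$.

We now prove statement 2. In this case, we show that there exists
$\tau \in \mathrm{mgs}\bigl(\sigma \cup \{x = t\}\bigr)$ such that $\mathrm{hvars}(\sigma) \subseteq \mathrm{hvars}(\tau)$.

Let

\begin{align*}
\{u_1, \ldots, u_l\} &\stackrel{\mathrm{def}}{=} \mathrm{dom}(\sigma) \cap \mathrm{vars}(r'\sigma), \\
\bar{s} &\stackrel{\mathrm{def}}{=} (u_1, \ldots, u_l, r\sigma), \\
\bar{t} &\stackrel{\mathrm{def}}{=} (u_1\sigma, \ldots, u_l\sigma, r'\sigma).
\end{align*}

By Lemma 41 and the congruence axioms, $\sigma \cup \{x = t\} \implies \bar{s} = \bar{t}$. Thus, as
$\sigma \cup \{x = t\}$ is satisfiable in $\mathcal{RT}$, $\mathrm{mgs}(\bar{s} = \bar{t}) \neq \emptyset$. Then, by Theorem 50, there
exists $\mu \in \mathit{VSubst} \cap \mathrm{mgs}(\bar{s} = \bar{t})$. Therefore, since $r\sigma \in \mathit{HTerms} \cap \mathit{GTerms}$ and
$\mu \implies \{r\sigma = r'\sigma\}$, Lemma 51 applies (with $s = r\sigma$) so that we can conclude
$r\sigma = r'\sigma\mu \in \mathit{HTerms} \cap \mathit{GTerms}$. Hence, for all $w \in \mathrm{dom}(\mu)$,

\[ \mathrm{vars}(w\mu) = \emptyset. \tag{B.41} \]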
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Let 




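Then, as $\sigma, \mu \in \mathrm{RSubst}$, it follows from (B.41) that $\nu, \tau \in \mathrm{Eqs}$ have no identities
or circular subsets so that $\nu, \tau \in \mathrm{RSubst}$. By Lemma 41, $\tau \in \mathrm{mgs}(\sigma \cup \{x = t\})$.

Suppose that $y \in \mathrm{hvars}(\sigma)$. Then we show that $y \in \mathrm{hvars}(\tau)$. Using Proposition 55,
$\mathrm{rt}(y, \sigma) = y\sigma$ and

    $\mathrm{vars}(y\sigma) \cap \mathrm{dom}(\sigma) = \emptyset$.    (B.42)

We show that $\mathrm{vars}(y\tau) \cap \mathrm{dom}(\tau) = \emptyset$. Now, if $y \notin \mathrm{dom}(\tau)$, the result holds
trivially. Suppose that $y \in \mathrm{dom}(\nu)$, then $y\tau = y\sigma\mu$ and $y \in \mathrm{dom}(\sigma)$. Let $w$ be
any variable in $\mathrm{vars}(y\sigma)$ so that, by (B.42), $w \notin \mathrm{dom}(\sigma)$. If $w \notin \mathrm{dom}(\mu)$, then
$w = w\mu \notin \mathrm{dom}(\tau)$. If $w \in \mathrm{dom}(\mu)$, then, by (B.41), $\mathrm{vars}(w\mu) = \emptyset$. Therefore,
$\mathrm{vars}(w\mu) \cap \mathrm{dom}(\tau) = \emptyset$. It follows that $\mathrm{vars}(y\nu) \cap \mathrm{dom}(\tau) = \emptyset$. Finally, suppose
$y \in \mathrm{dom}(\mu)$. Then, by (B.41), $\mathrm{vars}(y\mu) = \emptyset$. Therefore $\mathrm{vars}(y\mu) \cap \mathrm{dom}(\tau) = \emptyset$.

Therefore, using Definition 12, we have that $y \in \mathrm{hvars}(\tau)$ as required. □

Proposition 64 Let $p \in P$ and $(x \mapsto t) \in \mathrm{Bind}$, where $\{x\} \cup \mathrm{vars}(t) \subseteq \mathit{VI}$.
Let also $\sigma \in \gamma_P(p) \cap \mathrm{VSubst}$ and suppose that $x \in \mathrm{hvars}(\sigma)$ and $\mathrm{vars}(t) \subseteq \mathrm{hvars}(\sigma)$.
Suppose also that $\mathrm{ind}_p(x, t)$ and $\mathrm{or\_lin}_p(x, t)$ hold. Then, for all
substitutions $\tau \in \mathrm{mgs}(\sigma \cup \{x = t\})$,

    $\mathrm{hvars}(\sigma) \subseteq \mathrm{hvars}(\tau)$.    (B.43)

PROOF. If $\sigma \cup \{x = t\}$ is not satisfiable, the result is trivial. We therefore
assume, for the rest of the proof, that $\sigma \cup \{x = t\}$ is satisfiable in $\mathcal{RT}$. It
follows from Corollary 60 that we just have to show that there exists
$\tau \in \mathrm{mgs}(\sigma \cup \{x = t\})$ such that (B.43) holds.

As $x \in \mathrm{hvars}(\sigma)$ and $\mathrm{vars}(t) \subseteq \mathrm{hvars}(\sigma)$, by using Proposition 55 we obtain
$\mathrm{rt}(x, \sigma) = x\sigma$ and $\mathrm{rt}(t, \sigma) = t\sigma$. Also

    $\mathrm{vars}(x\sigma) \cap \mathrm{dom}(\sigma) = \emptyset$,
    $\mathrm{vars}(t\sigma) \cap \mathrm{dom}(\sigma) = \emptyset$.    (B.44)

As $\mathrm{ind}_p(x, t)$ holds,

    $\mathrm{vars}(x\sigma) \cap \mathrm{vars}(t\sigma) = \emptyset$.    (B.45)

By hypothesis, $\mathrm{or\_lin}_p(x, t)$ holds so that, by Definition 8, for some $r \in \{x, t\}$,
$r\sigma$ is linear. Let $r' \stackrel{\mathrm{def}}{=} \{x, t\} \setminus \{r\}$.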

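By Lemma 41 and the congruence axioms, $\sigma \cup \{x = t\} \implies \{r\sigma = r'\sigma\}$.
Thus, as $\sigma \cup \{x = t\}$ is satisfiable in $\mathcal{RT}$, $\mathrm{mgs}(r\sigma = r'\sigma) \neq \emptyset$. Thus we can
apply Lemma 61 (where $s = r\sigma$ and $t = r'\sigma$) so that, using (B.45), there exists
$\mu \in \mathrm{mgs}(x\sigma = t\sigma)$ such that, for all $w \in \mathrm{dom}(\mu)$,

    $\mathrm{vars}(w\mu) \cap \mathrm{dom}(\mu) = \emptyset$.    (B.46)

Note that, by (B.44),

    $\mathrm{dom}(\sigma) \cap \mathrm{vars}(\mu) = \emptyset$.    (B.47)

Let

    $\nu \stackrel{\mathrm{def}}{=} \{\, z = z\sigma\mu \mid z \in \mathrm{dom}(\sigma) \,\}$,
    $\tau \stackrel{\mathrm{def}}{=} \nu \cup \mu$.

Then, as $\sigma, \mu \in \mathrm{RSubst}$, it follows from (B.47) that $\nu, \tau \in \mathrm{Eqs}$ have no identities
or circular subsets so that $\nu, \tau \in \mathrm{RSubst}$. By Lemma 41, $\tau \in \mathrm{mgs}(\sigma \cup \{x = t\})$.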

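Suppose $y \in \mathrm{hvars}(\sigma)$. Then we show that $y \in \mathrm{hvars}(\tau)$. As $y \in \mathrm{HTerms}$, we
have, using Proposition 55, $\mathrm{rt}(y, \sigma) = y\sigma$ and

    $\mathrm{vars}(y\sigma) \cap \mathrm{dom}(\sigma) = \emptyset$.    (B.48)

We show that $\mathrm{vars}(y\tau) \cap \mathrm{dom}(\tau) = \emptyset$. If $y \notin \mathrm{dom}(\tau)$, the result holds trivially.
Suppose that $y \in \mathrm{dom}(\nu)$, then $y\tau = y\sigma\mu$. Let $w$ be any variable in $\mathrm{vars}(y\sigma)$.
Then, by (B.48), $w \notin \mathrm{dom}(\sigma)$. If $w \notin \mathrm{dom}(\mu)$, then $w = w\mu \notin \mathrm{dom}(\tau)$. If
$w \in \mathrm{dom}(\mu)$, then $\mathrm{vars}(w\mu) \subseteq \mathrm{vars}(\mu)$ so that, by (B.47), $\mathrm{vars}(w\mu) \cap \mathrm{dom}(\nu) = \emptyset$.
Moreover (B.46) applies so that $\mathrm{vars}(w\mu) \cap \mathrm{dom}(\mu) = \emptyset$. Therefore we have
$\mathrm{vars}(w\mu) \cap \mathrm{dom}(\tau) = \emptyset$. It follows that $\mathrm{vars}(y\nu) \cap \mathrm{dom}(\tau) = \emptyset$. Finally, suppose
$y \in \mathrm{dom}(\mu)$. Then $y\tau = y\mu$ and, by (B.47), we have $\mathrm{vars}(y\mu) \cap \mathrm{dom}(\nu) = \emptyset$.
Also (B.46) applies where $w$ is replaced by $y$ so that $\mathrm{vars}(y\mu) \cap \mathrm{dom}(\mu) = \emptyset$.
Thus $\mathrm{vars}(y\mu) \cap \mathrm{dom}(\tau) = \emptyset$.

Therefore, using Definition 12, we have that $y \in \mathrm{hvars}(\tau)$ as required. □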

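Proposition 65 Let $p \in P$ and $(x \mapsto t) \in \mathrm{Bind}$, where $\{x\} \cup \mathrm{vars}(t) \subseteq \mathit{VI}$. Let
also $\sigma \in \gamma_P(p) \cap \mathrm{VSubst}$ and suppose that $x \in \mathrm{hvars}(\sigma)$ and $\mathrm{vars}(t) \subseteq \mathrm{hvars}(\sigma)$.
Suppose also that $\mathrm{gfree}_p(x)$ and $\mathrm{gfree}_p(t)$ hold. Then, for all $\tau \in \mathrm{mgs}(\sigma \cup \{x = t\})$,
we have

    $\mathrm{hvars}(\sigma) \subseteq \mathrm{hvars}(\tau)$.    (B.49)

PROOF. If $\sigma \cup \{x = t\}$ is not satisfiable, the result is trivial. We therefore
assume, for the rest of the proof, that $\sigma \cup \{x = t\}$ is satisfiable in $\mathcal{RT}$. It
follows from Corollary 60 that we just have to show that there exists
$\tau \in \mathrm{mgs}(\sigma \cup \{x = t\})$ such that (B.49) holds.

By Definition 8, $\mathrm{gfree}_p(x)$ and $\mathrm{gfree}_p(t)$ imply that either $\mathrm{rt}(x, \sigma) \in \mathrm{GTerms}$
or $\mathrm{rt}(x, \sigma) \in \mathrm{Vars}$, and either $\mathrm{rt}(t, \sigma) \in \mathrm{GTerms}$ or $\mathrm{rt}(t, \sigma) \in \mathrm{Vars}$. Since
we have $\mathrm{rt}(x, \sigma), \mathrm{rt}(t, \sigma) \in \mathrm{HTerms}$ and $\sigma \in \mathrm{VSubst}$, as a consequence of
Proposition 55, we have $\mathrm{rt}(x, \sigma) = x\sigma$, $\mathrm{rt}(t, \sigma) = t\sigma$ and $x\sigma, t\sigma \notin \mathrm{dom}(\sigma)$.
There are three cases:

• $\mathrm{vars}(x\sigma) = \emptyset \lor \mathrm{vars}(t\sigma) = \emptyset$. Then the result follows from Proposition 63.

• $x\sigma = t\sigma \in \mathrm{Vars}$. Then letting $\tau = \sigma$ gives the required result.

• $x\sigma, t\sigma \in \mathrm{Vars}$ are distinct variables. Let $\tau = \sigma \cup \{x\sigma = t\sigma\}$. Then, as
$x\sigma, t\sigma \notin \mathrm{dom}(\sigma)$, $\tau \in \mathrm{RSubst}$. Hence, by Lemma 41, $\tau \in \mathrm{mgs}(\sigma \cup \{x = t\})$.
Let $y$ be any variable in $\mathrm{hvars}(\sigma)$. We show that $y \in \mathrm{hvars}(\tau)$.

Suppose first that $y \neq x\sigma$. Then $y\tau = y\sigma$. Thus using Proposition 55,
$\mathrm{rt}(y, \sigma) = y\tau$ and $\mathrm{vars}(y\tau) \cap \mathrm{dom}(\sigma) = \emptyset$. Thus $\mathrm{vars}(y\tau) \cap \mathrm{dom}(\tau) \subseteq \{x\sigma\}$.
However, $x\sigma\tau = t\sigma \notin \mathrm{dom}(\tau)$ so that, by Definition 11, $\mathrm{vars}(y\tau) \subseteq \mathrm{hvars}_1(\tau)$
and hence $y \in \mathrm{hvars}_2(\tau)$. Therefore, by Definitions 11 and 12, we have
$y \in \mathrm{hvars}(\tau)$.

Secondly, suppose that $y = x\sigma$. Then $y\tau = t\sigma$. So that, as $t\sigma \in \mathrm{Vars} \setminus \mathrm{dom}(\sigma)$
and $x\sigma \neq t\sigma$, $\mathrm{vars}(y\tau) \cap \mathrm{dom}(\tau) = \emptyset$. Therefore, using Definition 12,
we have that $y \in \mathrm{hvars}(\tau)$ as required.

□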

Proposition 66 Let $p \in P$ and $(x \mapsto t) \in \mathrm{Bind}$, where $\{x\} \cup \mathrm{vars}(t) \subseteq \mathit{VI}$.
Let $\sigma \in \gamma_P(p) \cap \mathrm{VSubst}$ and suppose that $x \in \mathrm{hvars}(\sigma)$ and $\mathrm{vars}(t) \subseteq \mathrm{hvars}(\sigma)$.
Furthermore, suppose that $\mathrm{or\_lin}_p(x, t)$ and $\mathrm{share\_lin}_p(x, t)$ hold. Then, for all
substitutions $\tau \in \mathrm{mgs}(\sigma \cup \{x = t\})$, we have

    $\mathrm{hvars}(\sigma) \setminus \mathrm{share\_same\_var}_p(x, t) \subseteq \mathrm{hvars}(\tau)$.    (B.50)

PROOF. If $\sigma \cup \{x = t\}$ is not satisfiable, the result is trivial. We therefore
assume, for the rest of the proof, that $\sigma \cup \{x = t\}$ is satisfiable in $\mathcal{RT}$. It
follows from Corollary 60 that we just have to show that there exists
$\tau \in \mathrm{mgs}(\sigma \cup \{x = t\})$ such that (B.50) holds.

As $x \in \mathrm{hvars}(\sigma)$ and $\mathrm{vars}(t) \subseteq \mathrm{hvars}(\sigma)$, by using Proposition 55 we obtain
$\mathrm{rt}(x, \sigma) = x\sigma$ and $\mathrm{rt}(t, \sigma) = t\sigma$. Also

    $\mathrm{vars}(x\sigma) \cap \mathrm{dom}(\sigma) = \emptyset$,  $\mathrm{vars}(t\sigma) \cap \mathrm{dom}(\sigma) = \emptyset$.    (B.51)

By hypothesis, $\mathrm{or\_lin}_p(x, t)$ holds so that, by Definition 8, for some $r \in \{x, t\}$,
$r\sigma$ is linear. Also by hypothesis, $\mathrm{share\_lin}_p(x, t)$ holds so that, by Definition 8,
if $r' = \{x, t\} \setminus \{r\}$, for all $z \in \mathrm{vars}(r\sigma) \cap \mathrm{vars}(r'\sigma)$, $\mathrm{occ\_lin}(z, r'\sigma)$ holds.
Therefore,

    $\mathrm{vars}(r\sigma) \cap \mathrm{nlvars}(r'\sigma) = \emptyset$.    (B.52)

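By Lemma 41 and the congruence axioms, $\sigma \cup \{x = t\} \implies \{r\sigma = r'\sigma\}$. Thus,
as $\sigma \cup \{x = t\}$ is satisfiable in $\mathcal{RT}$, $\mathrm{mgs}(r\sigma = r'\sigma) \neq \emptyset$. Thus, as $r\sigma$ is linear
and (B.52) holds, we can apply Lemma 61 (where $s = r\sigma$ and $t = r'\sigma$) so that
there exists $\mu \in \mathrm{mgs}(x\sigma = t\sigma)$ such that, for all
$w \in \mathrm{dom}(\mu) \setminus (\mathrm{vars}(x\sigma) \cap \mathrm{vars}(t\sigma))$,

    $\mathrm{vars}(w\mu) \cap \mathrm{dom}(\mu) = \emptyset$.    (B.53)

Note that, by (B.51),

    $\mathrm{dom}(\sigma) \cap \mathrm{vars}(\mu) = \emptyset$.    (B.54)

Let

    $\nu \stackrel{\mathrm{def}}{=} \{\, z = z\sigma\mu \mid z \in \mathrm{dom}(\sigma) \,\}$,
    $\tau \stackrel{\mathrm{def}}{=} \nu \cup \mu$.

Then, as $\sigma, \mu \in \mathrm{RSubst}$, it follows from (B.54) that $\nu, \tau \in \mathrm{Eqs}$ have no identities
or circular subsets so that $\nu, \tau \in \mathrm{RSubst}$. By Lemma 41, $\tau \in \mathrm{mgs}(\sigma \cup \{x = t\})$.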

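Suppose $y \in \mathrm{hvars}(\sigma) \setminus \mathrm{share\_same\_var}_p(x, t)$. We show that $y \in \mathrm{hvars}(\tau)$. As
$y \in \mathrm{hvars}(\sigma)$, using Proposition 55, $\mathrm{rt}(y, \sigma) = y\sigma$ and

    $\mathrm{vars}(y\sigma) \cap \mathrm{dom}(\sigma) = \emptyset$.    (B.55)

As $y \notin \mathrm{share\_same\_var}_p(x, t)$, by Definition 8,

    $\mathrm{vars}(y\sigma) \cap \mathrm{vars}(x\sigma) \cap \mathrm{vars}(t\sigma) = \emptyset$.    (B.56)

Therefore, using (B.56) if $y \notin \mathrm{dom}(\sigma)$ and (B.51) if $y \in \mathrm{dom}(\sigma)$, it follows
that

    $y \notin \mathrm{vars}(x\sigma) \cap \mathrm{vars}(t\sigma)$.    (B.57)

We show that $\mathrm{vars}(y\tau) \cap \mathrm{dom}(\tau) = \emptyset$. Now, if $y \notin \mathrm{dom}(\tau)$, the result holds
trivially. Suppose that $y \in \mathrm{dom}(\nu)$, then $y\tau = y\sigma\mu$. Let $w$ be any variable
in $\mathrm{vars}(y\sigma)$. Then, by (B.56), $w \notin (\mathrm{vars}(x\sigma) \cap \mathrm{vars}(t\sigma))$ and, by (B.55),
$w \notin \mathrm{dom}(\sigma)$. If $w \notin \mathrm{dom}(\mu)$, then $w = w\mu \notin \mathrm{dom}(\tau)$. If $w \in \mathrm{dom}(\mu)$, then
$\mathrm{vars}(w\mu) \subseteq \mathrm{vars}(\mu)$ so that, by (B.54), we also have $\mathrm{vars}(w\mu) \cap \mathrm{dom}(\nu) = \emptyset$.
Moreover (B.53) applies so that $\mathrm{vars}(w\mu) \cap \mathrm{dom}(\mu) = \emptyset$. Therefore,
$\mathrm{vars}(w\mu) \cap \mathrm{dom}(\tau) = \emptyset$. It follows that $\mathrm{vars}(y\nu) \cap \mathrm{dom}(\tau) = \emptyset$. Finally, suppose
$y \in \mathrm{dom}(\mu)$. Then $y\tau = y\mu$ and, by (B.54), $\mathrm{vars}(y\mu) \cap \mathrm{dom}(\nu) = \emptyset$. As (B.57)
holds, (B.53) applies where $w$ is replaced by $y$ so that $\mathrm{vars}(y\mu) \cap \mathrm{dom}(\mu) = \emptyset$.
Thus $\mathrm{vars}(y\mu) \cap \mathrm{dom}(\tau) = \emptyset$.

Therefore, using Definition 12, we have that $y \in \mathrm{hvars}(\tau)$ as required. □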

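Proposition 67 Let $p \in P$ and $(x \mapsto t) \in \mathrm{Bind}$, where $\{x\} \cup \mathrm{vars}(t) \subseteq \mathit{VI}$. Let
also $\sigma \in \gamma_P(p) \cap \mathrm{VSubst}$ and suppose that $\{r, r'\} = \{x, t\}$, $\mathrm{vars}(r) \subseteq \mathrm{hvars}(\sigma)$
and $\mathrm{lin}_p(r)$ holds. Then, for all $\tau \in \mathrm{mgs}(\sigma \cup \{x = t\})$, we have

    $\mathrm{hvars}(\sigma) \setminus \mathrm{share\_with}_p(r) \subseteq \mathrm{hvars}(\tau)$.    (B.58)

PROOF. If $\sigma \cup \{x = t\}$ is not satisfiable, the result is trivial. We therefore
assume, for the rest of the proof, that $\sigma \cup \{x = t\}$ is satisfiable in $\mathcal{RT}$. It
follows from Corollary 60 that we just have to show that there exists
$\tau \in \mathrm{mgs}(\sigma \cup \{x = t\})$ such that (B.58) holds.

By hypothesis, $\mathrm{vars}(r) \subseteq \mathrm{hvars}(\sigma)$. Hence, by Proposition 55, $\mathrm{rt}(r, \sigma) = r\sigma$
and

    $\mathrm{vars}(r\sigma) \cap \mathrm{dom}(\sigma) = \emptyset$.    (B.59)

By hypothesis, $\mathrm{lin}_p(r)$ holds, so that, by Definition 8, $r\sigma$ is linear.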

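Let

    $\{u_1, \ldots, u_l\} \stackrel{\mathrm{def}}{=} \mathrm{dom}(\sigma) \cap (\mathrm{vars}(x\sigma) \cup \mathrm{vars}(t\sigma))$,
    $\bar{s} \stackrel{\mathrm{def}}{=} (u_1, \ldots, u_l, r\sigma)$,
    $\bar{t} \stackrel{\mathrm{def}}{=} (u_1\sigma, \ldots, u_l\sigma, r'\sigma)$.

Since $r\sigma$ is linear, it follows from (B.59) that $\bar{s}$ is linear. By Lemma 41 and the
congruence axioms, $\sigma \cup \{x = t\} \implies \bar{s} = \bar{t}$. Thus, as $\sigma \cup \{x = t\}$ is satisfiable
in $\mathcal{RT}$, we have $\mathrm{mgs}(\bar{s} = \bar{t}) \neq \emptyset$. Therefore, we can apply Lemma 62 so that
there exists $\mu \in \mathrm{mgs}(\bar{s} = \bar{t})$ such that, for all $w \in \mathrm{dom}(\mu) \setminus \mathrm{vars}(\bar{s})$,

    $\mathrm{vars}(w\mu) \cap \mathrm{dom}(\mu) = \emptyset$.    (B.60)

Note that, since $\sigma \in \mathrm{VSubst}$, for each $i = 1, \ldots, l$, we have

    $\mathrm{vars}(u_i\sigma) \subseteq \mathrm{vars}(x\sigma) \cup \mathrm{vars}(t\sigma)$.

Thus

    $\mathrm{vars}(\mu) \subseteq \mathrm{vars}(x\sigma) \cup \mathrm{vars}(t\sigma)$.    (B.61)

Let

    $\nu \stackrel{\mathrm{def}}{=} \{\, z = z\sigma\mu \mid z \in \mathrm{dom}(\sigma) \setminus (\mathrm{vars}(x\sigma) \cup \mathrm{vars}(t\sigma)) \,\}$,
    $\tau \stackrel{\mathrm{def}}{=} \nu \cup \mu$.

Then, as $\sigma, \mu \in \mathrm{RSubst}$, it follows from (B.61) that $\nu, \tau \in \mathrm{Eqs}$ have no identities
or circular subsets so that $\nu, \tau \in \mathrm{RSubst}$. By Lemma 41, $\tau \in \mathrm{mgs}(\sigma \cup \{x = t\})$.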

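Suppose $y \in \mathrm{hvars}(\sigma) \setminus \mathrm{share\_with}_p(r)$. Then we show that $y \in \mathrm{hvars}(\tau)$. As
$y \in \mathrm{hvars}(\sigma)$, by Proposition 55, $\mathrm{rt}(y, \sigma) = y\sigma$ and

    $\mathrm{vars}(y\sigma) \cap \mathrm{dom}(\sigma) = \emptyset$.    (B.62)

As $y \notin \mathrm{share\_with}_p(r)$, by Definition 8, $y \notin \mathrm{share\_same\_var}_p(y, r)$ so that,
using the same definition,

    $\mathrm{vars}(y\sigma) \cap \mathrm{vars}(r\sigma) = \emptyset$.    (B.63)

Therefore using (B.63) if $y \notin \mathrm{dom}(\sigma)$ and (B.59) if $y \in \mathrm{dom}(\sigma)$, it follows that

    $y \notin \mathrm{vars}(r\sigma)$.    (B.64)

We show that $\mathrm{vars}(y\tau) \cap \mathrm{dom}(\tau) = \emptyset$. Now, if $y \notin \mathrm{dom}(\tau)$, the result holds
trivially. Suppose that $y \in \mathrm{dom}(\nu)$. Then $y\tau = y\sigma\mu$ and $y \in \mathrm{dom}(\sigma)$. It follows
from (B.62) and (B.63) that $\mathrm{vars}(y\sigma) \cap \mathrm{vars}(\bar{s}) = \emptyset$. Let $w$ be any variable in
$\mathrm{vars}(y\sigma)$ so that $w \notin \mathrm{vars}(\bar{s})$. By (B.62), we have $w \notin \mathrm{dom}(\sigma)$. If $w \notin \mathrm{dom}(\mu)$,
then we have $w = w\mu \notin \mathrm{dom}(\tau)$. If $w \in \mathrm{dom}(\mu)$, then $\mathrm{vars}(w\mu) \subseteq \mathrm{vars}(\mu)$
so that, by (B.61), $\mathrm{vars}(w\mu) \cap \mathrm{dom}(\nu) = \emptyset$. Moreover (B.60) applies so that
$\mathrm{vars}(w\mu) \cap \mathrm{dom}(\mu) = \emptyset$. Therefore, $\mathrm{vars}(w\mu) \cap \mathrm{dom}(\tau) = \emptyset$. It follows that
$\mathrm{vars}(y\nu) \cap \mathrm{dom}(\tau) = \emptyset$. Finally, suppose $y \in \mathrm{dom}(\mu)$. Then $y\tau = y\mu$ and, by
(B.61), $\mathrm{vars}(y\mu) \cap \mathrm{dom}(\nu) = \emptyset$. Since $\sigma \in \mathrm{VSubst}$ and $y \in \mathrm{hvars}(\sigma)$, we have
$y \notin \mathrm{dom}(\sigma) \cap (\mathrm{vars}(r\sigma) \cup \mathrm{vars}(r'\sigma))$ and hence $y \notin \mathrm{vars}(\bar{s})$. Therefore (B.60)
applies and $\mathrm{vars}(y\mu) \cap \mathrm{dom}(\mu) = \emptyset$. Thus $\mathrm{vars}(y\mu) \cap \mathrm{dom}(\tau) = \emptyset$.

Therefore, using Definition 12, we have that $y \in \mathrm{hvars}(\tau)$ as required. □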

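Proposition 68 Let $p \in P$ and $(x \mapsto t) \in \mathrm{Bind}$, where $\{x\} \cup \mathrm{vars}(t) \subseteq \mathit{VI}$.
Let also $\sigma \in \gamma_P(p) \cap \mathrm{VSubst}$. Then, for all $\tau \in \mathrm{mgs}(\sigma \cup \{x = t\})$,

    $\mathrm{hvars}(\sigma) \setminus (\mathrm{share\_with}_p(x) \cup \mathrm{share\_with}_p(t)) \subseteq \mathrm{hvars}(\tau)$.    (B.65)

PROOF. If $\sigma \cup \{x = t\}$ is not satisfiable, the result is trivial. We therefore
assume, for the rest of the proof, that $\sigma \cup \{x = t\}$ is satisfiable in $\mathcal{RT}$. It
follows from Corollary 60 that we just have to show that there exists
$\tau \in \mathrm{mgs}(\sigma \cup \{x = t\})$ such that (B.65) holds.

Let

    $\{u_1, \ldots, u_l\} \stackrel{\mathrm{def}}{=} \mathrm{dom}(\sigma) \cap (\mathrm{vars}(x\sigma) \cup \mathrm{vars}(t\sigma))$,
    $\bar{s} \stackrel{\mathrm{def}}{=} (u_1, \ldots, u_l, x\sigma)$,
    $\bar{t} \stackrel{\mathrm{def}}{=} (u_1\sigma, \ldots, u_l\sigma, t\sigma)$.

Note that, since $\sigma \in \mathrm{VSubst}$, for each $i = 1, \ldots, l$, we have

    $\mathrm{vars}(u_i\sigma) \subseteq \mathrm{vars}(x\sigma) \cup \mathrm{vars}(t\sigma)$.

Thus, for any $\mu \in \mathrm{mgs}(\bar{s} = \bar{t})$, we have

    $\mathrm{vars}(\mu) \subseteq \mathrm{vars}(x\sigma) \cup \mathrm{vars}(t\sigma)$.    (B.66)

Let

    $\nu \stackrel{\mathrm{def}}{=} \{\, z = z\sigma\mu \mid z \in \mathrm{dom}(\sigma) \setminus (\mathrm{vars}(x\sigma) \cup \mathrm{vars}(t\sigma)) \,\}$,
    $\tau \stackrel{\mathrm{def}}{=} \nu \cup \mu$.

Then, as $\sigma, \mu \in \mathrm{RSubst}$, it follows from (B.66) that $\nu, \tau \in \mathrm{Eqs}$ have no identities
or circular subsets so that $\nu, \tau \in \mathrm{RSubst}$. Thus, using Lemma 41 and the
assumption that $\sigma \cup \{x = t\}$ is satisfiable in $\mathcal{RT}$, $\tau \in \mathrm{mgs}(\sigma \cup \{x = t\})$.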

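Suppose that $y \in \mathrm{hvars}(\sigma) \setminus (\mathrm{share\_with}_p(x) \cup \mathrm{share\_with}_p(t))$. We show that
$y \in \mathrm{hvars}(\tau)$. As $y \in \mathrm{hvars}(\sigma)$, by Proposition 55, $\mathrm{rt}(y, \sigma) = y\sigma$ and

    $\mathrm{vars}(y\sigma) \cap \mathrm{dom}(\sigma) = \emptyset$.    (B.67)

As $y \notin \mathrm{share\_with}_p(x) \cup \mathrm{share\_with}_p(t)$, it follows from Definition 8 that

    $y \notin \mathrm{share\_same\_var}_p(y, x) \cup \mathrm{share\_same\_var}_p(y, t)$

so that, using the same definition with the result that $\mathrm{rt}(y, \sigma) = y\sigma$, we obtain

    $\mathrm{vars}(y\sigma) \cap (\mathrm{vars}(x\sigma) \cup \mathrm{vars}(t\sigma)) = \emptyset$.    (B.68)

Therefore, using (B.68) if $y \notin \mathrm{dom}(\sigma)$ and using the fact that $\sigma \in \mathrm{VSubst}$ if
$y \in \mathrm{dom}(\sigma)$, it follows that

    $y \notin \mathrm{vars}(x\sigma) \cup \mathrm{vars}(t\sigma)$.    (B.69)

We show that $\mathrm{vars}(y\tau) \cap \mathrm{dom}(\tau) = \emptyset$. Now, if $y \notin \mathrm{dom}(\tau)$, the result holds
trivially. Suppose that $y \in \mathrm{dom}(\tau)$. Then, by (B.66) and (B.69), $y \notin \mathrm{vars}(\mu)$ so
that $y \notin \mathrm{dom}(\mu)$ and $\mathrm{vars}(y\mu) \cap \mathrm{dom}(\mu) = \emptyset$. Thus we must have $y \in \mathrm{dom}(\nu)$
and $y\tau = y\sigma$. Then, by (B.66) and (B.68), $\mathrm{vars}(y\sigma) \cap \mathrm{dom}(\mu) = \emptyset$. Moreover,
by (B.67), $\mathrm{vars}(y\sigma) \cap \mathrm{dom}(\sigma) = \emptyset$. It follows that $\mathrm{vars}(y\sigma) \cap \mathrm{dom}(\tau) = \emptyset$ and
hence, as $y\sigma = y\tau$, $\mathrm{vars}(y\tau) \cap \mathrm{dom}(\tau) = \emptyset$.

Therefore, using Definition 12, we have that $y \in \mathrm{hvars}(\tau)$ as required. □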



Proof of Theorem 19 on page 21. By hypothesis, $\sigma \in \gamma_P(p)$. By Theorem 50,
there exists $\sigma' \in \mathrm{VSubst}$ such that $\mathcal{RT} \vdash \forall(\sigma \leftrightarrow \sigma')$. By Proposition 59,
as $\sigma, \sigma'$ are satisfiable in $\mathcal{RT}$, we have that $\mathrm{hvars}(\sigma) = \mathrm{hvars}(\sigma')$. By
Definition 7, $\sigma \in \gamma_P(p)$ if and only if $\sigma' \in \gamma_P(p)$. We therefore safely assume
that $\sigma \in \mathrm{VSubst}$.

By hypothesis, we have $\sigma \in \gamma_H(h)$. Therefore, it follows from Definition 16
that $h \subseteq \mathrm{hvars}(\sigma)$. Similarly, by Definition 16, in order to prove $\tau \in \gamma_H(h')$,
we just need to show that $h' \subseteq \mathrm{hvars}(\tau)$ where $h'$ is as defined in Definition 18.
There are eight cases that have to be considered.

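(1) $\mathrm{hterm}_h(x) \land \mathrm{ground}_p(x)$ holds.

    As $\mathrm{hterm}_h(x)$ holds, by Definition 18, $x \in h$. Hence, by Definition 16,
    we have $x \in \mathrm{hvars}(\sigma)$. As $\mathrm{ground}_p(x)$ holds, by Definition 8,
    $\mathrm{rt}(x, \sigma) \in \mathrm{GTerms}$. Therefore we can apply Proposition 63, where $r$ is
    replaced by $x$ and $r'$ by $t$, to conclude that

        $\mathrm{hvars}(\sigma) \cup \mathrm{vars}(t) \subseteq \mathrm{hvars}(\tau)$.

(2) $\mathrm{hterm}_h(t) \land \mathrm{ground}_p(t)$ holds.

    As $\mathrm{hterm}_h(t)$ holds, by Definition 18, $\mathrm{vars}(t) \subseteq h$. Hence, by Definition 16,
    $\mathrm{vars}(t) \subseteq \mathrm{hvars}(\sigma)$. As $\mathrm{ground}_p(t)$ holds, by Definition 8,
    $\mathrm{rt}(t, \sigma) \in \mathrm{GTerms}$. Therefore we can apply Proposition 63, where $r$ is
    replaced by $t$ and $r'$ by $x$, to conclude that

        $\mathrm{hvars}(\sigma) \cup \{x\} \subseteq \mathrm{hvars}(\tau)$.

(3) $\mathrm{hterm}_h(x) \land \mathrm{hterm}_h(t) \land \mathrm{ind}_p(x, t) \land \mathrm{or\_lin}_p(x, t)$ holds.

    As $\mathrm{hterm}_h(x)$ and $\mathrm{hterm}_h(t)$ hold, by Definition 18, $x \in h$ and $\mathrm{vars}(t) \subseteq h$.
    Hence, by Definition 16, $x \in \mathrm{hvars}(\sigma)$ and $\mathrm{vars}(t) \subseteq \mathrm{hvars}(\sigma)$. Therefore
    we can apply Proposition 64 to conclude that

        $\mathrm{hvars}(\sigma) \subseteq \mathrm{hvars}(\tau)$.

(4) $\mathrm{hterm}_h(x) \land \mathrm{hterm}_h(t) \land \mathrm{gfree}_p(x) \land \mathrm{gfree}_p(t)$ holds.

    As $\mathrm{hterm}_h(x)$ and $\mathrm{hterm}_h(t)$ hold, by Definition 18, $x \in h$ and $\mathrm{vars}(t) \subseteq h$.
    Hence, by Definition 16, $x \in \mathrm{hvars}(\sigma)$ and $\mathrm{vars}(t) \subseteq \mathrm{hvars}(\sigma)$. Therefore
    we can apply Proposition 65 to conclude that

        $\mathrm{hvars}(\sigma) \subseteq \mathrm{hvars}(\tau)$.

(5) $\mathrm{hterm}_h(x) \land \mathrm{hterm}_h(t) \land \mathrm{share\_lin}_p(x, t) \land \mathrm{or\_lin}_p(x, t)$ holds.

    As $\mathrm{hterm}_h(x)$ and $\mathrm{hterm}_h(t)$ hold, by Definition 18, $x \in h$ and $\mathrm{vars}(t) \subseteq h$.
    Hence, by Definition 16, $x \in \mathrm{hvars}(\sigma)$ and $\mathrm{vars}(t) \subseteq \mathrm{hvars}(\sigma)$. Therefore
    we can apply Proposition 66 to conclude that

        $\mathrm{hvars}(\sigma) \setminus \mathrm{share\_same\_var}_p(x, t) \subseteq \mathrm{hvars}(\tau)$.

(6) $\mathrm{hterm}_h(x) \land \mathrm{lin}_p(x)$ holds.

    As $\mathrm{hterm}_h(x)$ holds, by Definition 18, $x \in h$. Hence, by Definition 16,
    we have $x \in \mathrm{hvars}(\sigma)$. Therefore we can apply Proposition 67, where $r$ is
    replaced by $x$ and $r'$ by $t$, to conclude that

        $\mathrm{hvars}(\sigma) \setminus \mathrm{share\_with}_p(x) \subseteq \mathrm{hvars}(\tau)$.

(7) $\mathrm{hterm}_h(t) \land \mathrm{lin}_p(t)$ holds.

    As $\mathrm{hterm}_h(t)$ holds, by Definition 18, $\mathrm{vars}(t) \subseteq h$. Hence, by Definition 16,
    $\mathrm{vars}(t) \subseteq \mathrm{hvars}(\sigma)$. Therefore we can apply Proposition 67, where
    $r$ is replaced by $t$ and $r'$ by $x$, to conclude that

        $\mathrm{hvars}(\sigma) \setminus \mathrm{share\_with}_p(t) \subseteq \mathrm{hvars}(\tau)$.

(8) For all $(x \mapsto t) \in \mathrm{Bind}$ where $\{x\} \cup \mathrm{vars}(t) \subseteq \mathit{VI}$, Proposition 68 applies
    so that

        $\mathrm{hvars}(\sigma) \setminus (\mathrm{share\_with}_p(x) \cup \mathrm{share\_with}_p(t)) \subseteq \mathrm{hvars}(\tau)$.

□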
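
The eight cases above can be read as a decision procedure for the finiteness component. The following is a minimal Python sketch of that reading, not the paper's Definition 18 itself: the class `SharingInfo`, the function `bind_finite_vars`, and the choice to try the cases in the listed order are assumptions made for illustration, and the sharing predicates are supplied as precomputed answers rather than derived from a real sharing domain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharingInfo:
    """Hypothetical answers of the sharing component p for one binding x -> t."""
    ground_x: bool = False
    ground_t: bool = False
    ind: bool = False
    or_lin: bool = False
    share_lin: bool = False
    gfree_x: bool = False
    gfree_t: bool = False
    lin_x: bool = False
    lin_t: bool = False
    share_same_var: frozenset = frozenset()
    share_with_x: frozenset = frozenset()
    share_with_t: frozenset = frozenset()


def bind_finite_vars(h, x, t_vars, p):
    """Return a set of variables still known to be finite after the binding x -> t."""
    h = frozenset(h)
    t_vars = frozenset(t_vars)
    hterm_x = x in h           # hterm_h(x)
    hterm_t = t_vars <= h      # hterm_h(t)
    if hterm_x and p.ground_x:                                   # case (1)
        return h | t_vars
    if hterm_t and p.ground_t:                                   # case (2)
        return h | {x}
    if hterm_x and hterm_t and p.ind and p.or_lin:               # case (3)
        return h
    if hterm_x and hterm_t and p.gfree_x and p.gfree_t:          # case (4)
        return h
    if hterm_x and hterm_t and p.share_lin and p.or_lin:         # case (5)
        return h - p.share_same_var
    if hterm_x and p.lin_x:                                      # case (6)
        return h - p.share_with_x
    if hterm_t and p.lin_t:                                      # case (7)
        return h - p.share_with_t
    return h - (p.share_with_x | p.share_with_t)                 # case (8)
```

For instance, with `h = {"x", "y"}`, a binding of `x` to a term over `{"z"}`, and a sharing component reporting that `x` is ground, case (1) applies and `z` is added to the finite variables.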



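Proof of Theorem 21 on page 22. Suppose that $\tau \in \exists x \mathbin{.} \{\sigma\}$. We need
to show that $\tau \in \gamma_H(\mathrm{proj}_H(h, x))$.

Let $V = \mathrm{Vars} \setminus \mathit{VI}$. Then, by Definition 5, $\mathcal{RT} \vdash \forall(\exists V \mathbin{.} (\tau \leftrightarrow \exists x \mathbin{.} \sigma))$. Thus
we have

    $\mathcal{RT} \vdash \forall((\exists V \mathbin{.} \tau) \leftrightarrow (\exists V \cup \{x\} \mathbin{.} \sigma))$.    (B.70)

Suppose $v \in V \setminus \mathrm{vars}(\sigma)$. As we assumed that Vars is denumerable and that
$\mathit{VI}$ is finite, such a $v$ will exist. Moreover, as $x \in \mathit{VI}$, we have $x \neq v$. Let
$\sigma' \in \mathrm{RSubst}$ be obtained from $\sigma$ by replacing every occurrence of $x$ by $v$.
Formally, if $\rho = \{x \mapsto v\}$, let

    $\sigma' \stackrel{\mathrm{def}}{=} \{\, y \mapsto y\sigma\rho \mid y \in \mathrm{dom}(\sigma) \setminus \{x\} \,\} \cup \sigma''$,

where $\sigma'' = \{v \mapsto x\sigma\rho\}$ if $x \in \mathrm{dom}(\sigma)$ and $\emptyset$ otherwise. Then $\sigma' \in \mathrm{RSubst}$
and

    $\mathcal{RT} \vdash \forall((\exists V \mathbin{.} \sigma') \leftrightarrow (\exists V \cup \{x\} \mathbin{.} \sigma))$.

Thus, by (B.70), $\mathcal{RT} \vdash \forall((\exists V \mathbin{.} \tau) \leftrightarrow (\exists V \mathbin{.} \sigma'))$. Therefore, by Proposition 59,

    $\mathrm{hvars}(\tau) \cap \mathit{VI} = \mathrm{hvars}(\sigma') \cap \mathit{VI}$.    (B.71)

As $\sigma' \in \mathrm{RSubst}$ and $x \notin \mathrm{dom}(\sigma')$, $\mathrm{rt}(x, \sigma') = x$ so that, by Proposition 12,
$x \in \mathrm{hvars}(\sigma')$. Also, as $\sigma'$ is obtained from $\sigma$ by renaming $x$ to the new variable
$v$, $\mathrm{hvars}(\sigma') \supseteq \mathrm{hvars}(\sigma) \setminus \{v\}$. Since $v \notin \mathit{VI}$, we have

    $\mathrm{hvars}(\sigma') \cap \mathit{VI} \supseteq (\mathrm{hvars}(\sigma) \cup \{x\}) \cap \mathit{VI}$.

Therefore, by (B.71),

    $\mathrm{hvars}(\tau) \cap \mathit{VI} \supseteq (\mathrm{hvars}(\sigma) \cup \{x\}) \cap \mathit{VI}$.    (B.72)

By hypothesis, $\sigma \in \gamma_H(h)$, so that, by Definition 16, $\mathrm{hvars}(\sigma) \supseteq h$. Therefore,
by (B.72), $\mathrm{hvars}(\tau) \cap \mathit{VI} \supseteq (h \cup \{x\}) \cap \mathit{VI}$. Thus, by applying Definition 16,
we can conclude that $\tau \in \gamma_H(h \cup \{x\})$. □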



B.7 Finite-Tree Dependencies 



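The proof of Theorem 23 depends on the fact that finite-tree dependencies only
capture permanent information and that the $\gamma_F$ function is meet-preserving.

Proposition 69 Let $\sigma, \tau \in \mathrm{RSubst}$ and $\phi \in \mathrm{Bfun}$, where $\sigma \in \gamma_F(\phi)$ and
$\tau \in {\downarrow}\sigma$. Then $\tau \in \gamma_F(\phi)$.

PROOF. By the hypothesis, $\tau \in {\downarrow}\sigma$, so that, for each $\upsilon \in {\downarrow}\tau$, $\upsilon \in {\downarrow}\sigma$.
Therefore, as $\sigma \in \gamma_F(\phi)$, it follows from Definition 22 that, for all $\upsilon \in {\downarrow}\tau$,
$\phi(\mathrm{hval}(\upsilon)) = 1$ and hence $\tau \in \gamma_F(\phi)$. □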

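Proposition 70 Let $\phi_1, \phi_2 \in \mathrm{Bfun}$. Then

    $\gamma_F(\phi_1 \land \phi_2) = \gamma_F(\phi_1) \cap \gamma_F(\phi_2)$.

PROOF.

    $\gamma_F(\phi_1 \land \phi_2) = \{\, \sigma \in \mathrm{RSubst} \mid \forall \tau \in {\downarrow}\sigma : (\phi_1 \land \phi_2)(\mathrm{hval}(\tau)) = 1 \,\}$
    $\quad = \{\, \sigma \in \mathrm{RSubst} \mid \forall \tau \in {\downarrow}\sigma : \forall i \in \{1, 2\} : \phi_i(\mathrm{hval}(\tau)) = 1 \,\}$
    $\quad = \{\, \sigma \in \mathrm{RSubst} \mid \forall \tau \in {\downarrow}\sigma : \phi_1(\mathrm{hval}(\tau)) = 1 \,\}$
    $\qquad \cap \{\, \sigma \in \mathrm{RSubst} \mid \forall \tau \in {\downarrow}\sigma : \phi_2(\mathrm{hval}(\tau)) = 1 \,\}$
    $\quad = \gamma_F(\phi_1) \cap \gamma_F(\phi_2)$.

□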
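
The argument of Proposition 70 only uses the shape of the concretization (a universal quantification over instances), so it can be checked exhaustively on a small toy model. In the sketch below, which is an illustration and not the paper's domain, each concrete element is represented directly by a non-empty set of Boolean valuations standing for $\{\mathrm{hval}(\tau) \mid \tau \in {\downarrow}\sigma\}$; the names `ELEMENTS` and `gamma` are hypothetical.

```python
from itertools import combinations, product

VARS = ("x", "y")
POINTS = list(product((0, 1), repeat=len(VARS)))  # all valuations as bit tuples

# Assumption: every non-empty set of valuations stands for the set of
# valuations hval(tau) of the instances tau of some concrete element sigma.
ELEMENTS = [frozenset(c)
            for n in range(1, len(POINTS) + 1)
            for c in combinations(POINTS, n)]


def gamma(phi):
    """Elements all of whose instance valuations satisfy phi."""
    return {e for e in ELEMENTS
            if all(phi(dict(zip(VARS, pt))) for pt in e)}


phi1 = lambda v: (not v["x"]) or bool(v["y"])   # x -> y
phi2 = lambda v: bool(v["x"])                   # x
meet = lambda v: phi1(v) and phi2(v)            # phi1 /\ phi2

# Meet-preservation on the toy model.
assert gamma(meet) == gamma(phi1) & gamma(phi2)
```

The check succeeds because "every instance satisfies both conjuncts" and "every instance satisfies the first, and every instance satisfies the second" are the same condition, which is the step used in the proof.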



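Proof of Theorem 23 on page 25. Assuming the hypothesis of the theorem,
we will prove each relation separately.

(23a). Let $\sigma = \{x \mapsto t\}$ and suppose that $\tau \in {\downarrow}\sigma$. Then, by Proposition 2,
$\mathcal{RT} \vdash \forall(\tau \to \sigma)$. It follows from Lemma 42 that $\mathrm{rt}(x, \tau) = \mathrm{rt}(t, \tau)$ and thus,
by Proposition 13, $x \in \mathrm{hvars}(\tau)$ if and only if $\mathrm{vars}(t) \subseteq \mathrm{hvars}(\tau)$. This is
equivalent to $(x \leftrightarrow \bigwedge \mathrm{vars}(t))(\mathbf{0}[1/\mathrm{hvars}(\tau)]) = 1$ and, by Definition 22, to
$(x \leftrightarrow \bigwedge \mathrm{vars}(t))(\mathrm{hval}(\tau)) = 1$. As this holds for all $\tau \in {\downarrow}\sigma$, by Definition 22,
$\sigma \in \gamma_F(x \leftrightarrow \bigwedge \mathrm{vars}(t))$.

(23b). Let $\sigma = \{x \mapsto t\}$, where $x \in \mathrm{vars}(t)$. By Definition 12, $x \notin \mathrm{hvars}(\sigma)$. By
case (15a) of Proposition 15, for all $\tau \in {\downarrow}\sigma$, we have $\mathrm{hvars}(\tau) \subseteq \mathrm{hvars}(\sigma)$. Thus
$x \notin \mathrm{hvars}(\tau)$ and $(\lnot x)(\mathrm{hval}(\tau)) = 1$. Therefore, by Definition 22, $\sigma \in \gamma_F(\lnot x)$.

(23c). Let $\sigma \in \mathrm{RSubst}$ such that $x \in \mathrm{gvars}(\sigma) \cap \mathrm{hvars}(\sigma)$. By case (15b) of
Proposition 15, we have $x \in \mathrm{hvars}(\tau)$ for all $\tau \in {\downarrow}\sigma$. So $(x)(\mathrm{hval}(\tau)) = 1$.
Therefore, by Definition 22, $\sigma \in \gamma_F(x)$.

(23d). Let $\sigma_1 \in \Sigma_1$ and $\sigma_2 \in \Sigma_2$. Then, by hypothesis, $\sigma_1 \in \gamma_F(\phi_1)$ and
$\sigma_2 \in \gamma_F(\phi_2)$. Let $\tau \in \mathrm{mgs}(\sigma_1 \cup \sigma_2)$. By definition of mgs, $\mathcal{RT} \vdash \forall(\tau \to \sigma_1)$
and $\mathcal{RT} \vdash \forall(\tau \to \sigma_2)$. Thus, by Proposition 2, we have $\tau \in {\downarrow}\sigma_1 \cap {\downarrow}\sigma_2$.
Therefore, by Proposition 69, $\tau \in \gamma_F(\phi_1) \cap \gamma_F(\phi_2)$. The result then follows by
Proposition 70.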



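(23e). We have

    $\gamma_F(\phi_1 \lor \phi_2) = \{\, \sigma \in \mathrm{RSubst} \mid \forall \tau \in {\downarrow}\sigma : (\phi_1 \lor \phi_2)(\mathrm{hval}(\tau)) = 1 \,\}$
    $\quad = \{\, \sigma \in \mathrm{RSubst} \mid \forall \tau \in {\downarrow}\sigma : \exists i \in \{1, 2\} \mathbin{.} \phi_i(\mathrm{hval}(\tau)) = 1 \,\}$
    $\quad \supseteq \{\, \sigma \in \mathrm{RSubst} \mid \forall \tau \in {\downarrow}\sigma : \phi_1(\mathrm{hval}(\tau)) = 1 \,\}$
    $\qquad \cup \{\, \sigma \in \mathrm{RSubst} \mid \forall \tau \in {\downarrow}\sigma : \phi_2(\mathrm{hval}(\tau)) = 1 \,\}$
    $\quad = \gamma_F(\phi_1) \cup \gamma_F(\phi_2)$
    $\quad \supseteq \Sigma_1 \cup \Sigma_2$.

(23f). Let $\sigma \in \Sigma$ and let $\sigma' \in \exists x \mathbin{.} \{\sigma\}$. We will show that $\sigma' \in \gamma_F(\exists x \mathbin{.} \phi)$.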

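Let $\tau' \in {\downarrow}\sigma'$. Then there exists $\sigma_1' \in \mathrm{RSubst}$ such that $\mathcal{RT} \vdash \forall(\tau' \leftrightarrow (\sigma' \cup \sigma_1'))$.

Let $\sigma_1 \in \exists x \mathbin{.} \{\sigma_1'\}$ and let $W = (\mathrm{Vars} \setminus \mathit{VI}) \cup \{x\}$. Then, by Definition 5,
it follows $\mathcal{RT} \vdash \forall(\exists W \mathbin{.} (\sigma' \leftrightarrow \sigma))$ and $\mathcal{RT} \vdash \forall(\exists W \mathbin{.} (\sigma_1' \leftrightarrow \sigma_1))$. As a
consequence

    $\mathcal{RT} \vdash \forall(\exists W \mathbin{.} (\sigma' \cup \sigma_1') \leftrightarrow \exists W \mathbin{.} (\sigma \cup \sigma_1))$.

Therefore $\sigma \cup \sigma_1$ is satisfiable in $\mathcal{RT}$ so that, for some $\tau \in \mathrm{RSubst}$,
$\mathcal{RT} \vdash \forall(\tau \leftrightarrow (\sigma \cup \sigma_1))$. Thus $\mathcal{RT} \vdash \forall(\exists W \mathbin{.} \tau \leftrightarrow \exists W \mathbin{.} \tau')$. By Proposition 59,
$\mathrm{hvars}(\tau') \setminus W = \mathrm{hvars}(\tau) \setminus W$ so that

    $(\mathrm{hvars}(\tau') \cap \mathit{VI}) \cup \{x\} = (\mathrm{hvars}(\tau) \cap \mathit{VI}) \cup \{x\}$.    (B.73)

Let $c \stackrel{\mathrm{def}}{=} \mathrm{hval}(\tau)(x)$. Then, since $\tau \in {\downarrow}\sigma$ and, by hypothesis, $\sigma \in \gamma_F(\phi)$, we
have the following chain of implications:

    $\phi(\mathrm{hval}(\tau)) = 1$    [by Defn. 22]
    $\implies \phi(\mathrm{hval}(\tau)[c/x]) = 1$    [by Defn. 3]
    $\implies \phi(\mathbf{0}[1/\mathrm{hvars}(\tau) \cap \mathit{VI}][c/x]) = 1$    [by Defn. 22]
    $\implies \phi(\mathbf{0}[1/(\mathrm{hvars}(\tau) \cap \mathit{VI}) \cup \{x\}][c/x]) = 1$    [by Defn. 3]
    $\implies \phi(\mathbf{0}[1/(\mathrm{hvars}(\tau') \cap \mathit{VI}) \cup \{x\}][c/x]) = 1$    [by (B.73)]
    $\implies \phi(\mathbf{0}[1/\mathrm{hvars}(\tau') \cap \mathit{VI}][c/x]) = 1$    [by Defn. 3]
    $\implies \phi(\mathrm{hval}(\tau')[c/x]) = 1$    [by Defn. 22]
    $\implies \phi[c/x](\mathrm{hval}(\tau')) = 1$.    [by Defn. 4]

From this last relation, since $\phi[c/x] \models \exists x \mathbin{.} \phi$, it follows that

    $(\exists x \mathbin{.} \phi)(\mathrm{hval}(\tau')) = 1$.

As this holds for all $\tau' \in {\downarrow}\sigma'$, by Definition 22, $\sigma' \in \gamma_F(\exists x \mathbin{.} \phi)$. □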
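
The last step of case (23f) rests on the entailment $\phi[c/x] \models \exists x \mathbin{.} \phi$. A small Python sketch can make this concrete, assuming the usual Schröder-style reading of existential quantification on Boolean functions, $\exists x \mathbin{.} \phi = \phi[0/x] \lor \phi[1/x]$; the helper names `restrict`, `exists` and `entails` are hypothetical, and the example formula is the dependency that case (23a) would attach to a binding of $x$ to a term with variables $y$ and $z$.

```python
from itertools import product

VARS = ("x", "y", "z")


def valuations():
    """Enumerate all Boolean valuations over VARS as dictionaries."""
    for bits in product((0, 1), repeat=len(VARS)):
        yield dict(zip(VARS, bits))


def restrict(phi, var, c):
    """phi[c/var]: the function obtained by fixing var to the constant c."""
    return lambda v: phi({**v, var: c})


def exists(phi, var):
    """Assumed definition: (exists var . phi) = phi[0/var] or phi[1/var]."""
    return lambda v: restrict(phi, var, 0)(v) or restrict(phi, var, 1)(v)


def entails(phi, psi):
    """phi |= psi: every model of phi is a model of psi."""
    return all(psi(v) for v in valuations() if phi(v))


# x <-> (y /\ z), the dependency for a binding x -> f(y, z) as in case (23a).
phi = lambda v: bool(v["x"]) == bool(v["y"] and v["z"])

# Both restrictions entail the quantified formula.
assert all(entails(restrict(phi, "x", c), exists(phi, "x")) for c in (0, 1))

# Projecting y away leaves the weaker dependency x -> z.
x_implies_z = lambda v: (not v["x"]) or bool(v["z"])
assert all(exists(phi, "y")(v) == x_implies_z(v) for v in valuations())
```

Truth-table enumeration is exponential in the number of variables, so this is only suitable for illustration; an implementation would normally use a compact representation such as decision diagrams.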

Proof of Theorem 25 on page 26. Since $h \subseteq h'$, by the monotonicity of
$\gamma_H$ we have $\gamma_H(h) \supseteq \gamma_H(h')$, whence one of the inclusions:
$\gamma_H(h) \cap \gamma_F(\phi) \supseteq \gamma_H(h') \cap \gamma_F(\phi)$.

In order to establish the other inclusion, we now prove that $\sigma \in \gamma_H(h')$ assuming
$\sigma \in \gamma_H(h) \cap \gamma_F(\phi)$. To this end, by Definition 16, it is sufficient to prove
that $h' \subseteq \mathrm{hvars}(\sigma)$.

Let $z \in h'$ and let $\psi = (\phi \land \bigwedge h)$, so that, by hypothesis, $h' = \mathrm{true}(\psi)$.
Therefore, we have $\psi \models z$. Consider now $\psi' = (\phi \land \bigwedge \mathrm{hvars}(\sigma))$. Since
$\sigma \in \gamma_H(h)$, by Definition 16 we have $h \subseteq \mathrm{hvars}(\sigma)$, so that $\psi' \models \psi$ and thus $\psi' \models z$.

Since $\sigma \in \gamma_F(\phi)$, by Definition 22 we have $\phi(\mathrm{hval}(\sigma)) = 1$. Also note that
$(\bigwedge \mathrm{hvars}(\sigma))(\mathrm{hval}(\sigma)) = 1$. From these, by the definition of conjunction for
Boolean formulas, we obtain $\psi'(\mathrm{hval}(\sigma)) = 1$. Thus we can observe that

    $\psi'(\mathrm{hval}(\sigma)) = 1 \iff (\psi' \land z)(\mathrm{hval}(\sigma)) = 1$
    $\qquad \implies z \in \mathrm{hvars}(\sigma)$.

□
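
The proof above uses $h' = \mathrm{true}(\phi \land \bigwedge h)$, the set of variables entailed by the dependencies together with the variables already known to be finite. A minimal sketch of that computation by truth-table enumeration follows; the function names `true_vars` and `strengthen` are hypothetical, and the convention that an unsatisfiable function entails every variable is an assumption of the sketch.

```python
from itertools import product


def true_vars(phi, variables):
    """Variables that take value 1 in every model of phi.

    Assumption: if phi has no model, every variable is returned.
    """
    models = [v for v in (dict(zip(variables, bits))
                          for bits in product((0, 1), repeat=len(variables)))
              if phi(v)]
    return {x for x in variables if all(m[x] for m in models)}


def strengthen(h, phi, variables):
    """Compute true(phi /\\ /\\h): finite variables implied by h and phi."""
    conj = lambda v: phi(v) and all(v[x] for x in h)
    return true_vars(conj, variables)


VARS = ("x", "y", "z")
# x <-> (y /\ z): x is finite exactly when both y and z are.
phi = lambda v: bool(v["x"]) == bool(v["y"] and v["z"])

assert strengthen({"y", "z"}, phi, VARS) == {"x", "y", "z"}
assert strengthen({"x"}, phi, VARS) == {"x", "y", "z"}
assert strengthen({"y"}, phi, VARS) == {"y"}
```

In the first two calls the dependency propagates finiteness in either direction; in the third, knowing only that `y` is finite yields nothing new.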



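Proof of Theorem 27 on page 27. Suppose there exists $\sigma \in \gamma_H(h) \cap \gamma_F(\phi)$.
By Definition 22, since $\sigma \in {\downarrow}\sigma$, we have $\phi(\mathrm{hval}(\sigma)) = 1$; moreover, we have
$(\bigwedge \mathrm{hvars}(\sigma))(\mathrm{hval}(\sigma)) = 1$; therefore, by the definition of conjunction for
Boolean formulas, we obtain

    $(\phi \land \bigwedge h)(\mathrm{hval}(\sigma)) = 1$.

As a consequence, we also have

    $\mathrm{hvars}(\sigma) \cap \mathrm{false}(\phi \land \bigwedge h) = \emptyset$;

by Definition 16, $h \subseteq \mathrm{hvars}(\sigma)$, so that we can conclude
$h \cap \mathrm{false}(\phi \land \bigwedge h) = \emptyset$. □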
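
Read contrapositively, the argument above gives a cheap emptiness test: if some variable of $h$ also lies in $\mathrm{false}(\phi \land \bigwedge h)$, no substitution can be described by both components. The sketch below illustrates this under the assumption that $\mathrm{false}(\psi)$ is the set of variables that are 0 in every model of $\psi$ (hence all variables when $\psi$ is unsatisfiable); the names `false_vars` and `definitely_empty` are hypothetical.

```python
from itertools import product


def models(phi, variables):
    """Enumerate the valuations over `variables` that satisfy phi."""
    for bits in product((0, 1), repeat=len(variables)):
        v = dict(zip(variables, bits))
        if phi(v):
            yield v


def false_vars(phi, variables):
    """Variables that are 0 in every model of phi (all of them if phi is unsatisfiable)."""
    ms = list(models(phi, variables))
    return {x for x in variables if all(m[x] == 0 for m in ms)}


def definitely_empty(h, phi, variables):
    """True when h meets false(phi /\\ /\\h), so the two components contradict each other."""
    conj = lambda v: phi(v) and all(v[x] for x in h)
    return bool(set(h) & false_vars(conj, variables))


VARS = ("x", "y")
not_x = lambda v: not v["x"]   # e.g. produced by a cyclic binding, as in case (23b)

assert definitely_empty({"x"}, not_x, VARS)        # x claimed finite, yet forced infinite
assert not definitely_empty({"y"}, not_x, VARS)    # no contradiction
```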



B.8 Relation Between Groundness Dependencies and Finite-Tree Dependencies



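As was the case for finite-tree dependencies, groundness dependencies only
capture permanent information. Moreover, the $\gamma_G$ function is meet-preserving.

Proposition 71 Let $\sigma, \tau \in \mathrm{RSubst}$ and $\psi \in \mathrm{Pos}$, where we have $\sigma \in \gamma_G(\psi)$
and $\tau \in {\downarrow}\sigma$. Then $\tau \in \gamma_G(\psi)$.

PROOF. By the hypothesis, $\tau \in {\downarrow}\sigma$, so that, for each $\upsilon \in {\downarrow}\tau$, $\upsilon \in {\downarrow}\sigma$.
Therefore, as $\sigma \in \gamma_G(\psi)$, it follows from Definition 28 that, for all $\upsilon \in {\downarrow}\tau$,
$\psi(\mathrm{gval}(\upsilon)) = 1$ and hence $\tau \in \gamma_G(\psi)$. □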

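Proposition 72 Let $\psi_1, \psi_2 \in \mathrm{Pos}$. Then

    $\gamma_G(\psi_1 \land \psi_2) = \gamma_G(\psi_1) \cap \gamma_G(\psi_2)$.

PROOF.

    $\gamma_G(\psi_1 \land \psi_2) = \{\, \sigma \in \mathrm{RSubst} \mid \forall \tau \in {\downarrow}\sigma : (\psi_1 \land \psi_2)(\mathrm{gval}(\tau)) = 1 \,\}$
    $\quad = \{\, \sigma \in \mathrm{RSubst} \mid \forall \tau \in {\downarrow}\sigma : \forall i \in \{1, 2\} : \psi_i(\mathrm{gval}(\tau)) = 1 \,\}$
    $\quad = \{\, \sigma \in \mathrm{RSubst} \mid \forall \tau \in {\downarrow}\sigma : \psi_1(\mathrm{gval}(\tau)) = 1 \,\}$
    $\qquad \cap \{\, \sigma \in \mathrm{RSubst} \mid \forall \tau \in {\downarrow}\sigma : \psi_2(\mathrm{gval}(\tau)) = 1 \,\}$
    $\quad = \gamma_G(\psi_1) \cap \gamma_G(\psi_2)$.

□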



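Since non-ground terms can be made cyclic by instantiating their variables,
those terms detected as definitely finite on Bfun are also definitely ground.

Proposition 73 Let $x \in \mathit{VI}$. Then $\gamma_F(x) \subseteq \gamma_G(x)$.

PROOF. Suppose that $\sigma \in \gamma_F(x)$. Then, by Definition 22, $(x)(\mathrm{hval}(\tau)) = 1$
for all $\tau \in {\downarrow}\sigma$, so that $x \in \mathrm{hvars}(\tau)$; in particular, $x \in \mathrm{hvars}(\sigma)$. We prove
$x \in \mathrm{gvars}(\sigma)$ by contradiction. That is, we show that if $x \in \mathrm{hvars}(\sigma) \setminus \mathrm{gvars}(\sigma)$,
then there exists $\tau \in {\downarrow}\sigma$ for which $x \notin \mathrm{hvars}(\tau)$.

Suppose that $x \in \mathrm{hvars}(\sigma) \setminus \mathrm{gvars}(\sigma)$. Then, by Propositions 13 and 52,
$\mathrm{rt}(x, \sigma) \in \mathrm{HTerms} \setminus \mathrm{GTerms}$. Hence, by Proposition 44, there exists $i \in \mathbb{N}$ such
that $\mathrm{rt}(x, \sigma) = x\sigma^i$ and there exists $y \in \mathrm{vars}(x\sigma^i) \setminus \mathrm{dom}(\sigma)$. As we assumed that
Sig contains a function symbol of non-zero arity, there exists $t \in \mathrm{HTerms} \setminus \{y\}$
for which $\{y\} = \mathrm{vars}(t)$. It follows that $\sigma' = \{y \mapsto t\} \in \mathrm{RSubst}$ and, by Definition 12,
$y \notin \mathrm{hvars}(\sigma')$. Since $y \notin \mathrm{dom}(\sigma)$, by Lemma 39, $\tau = \sigma \cup \sigma' \in \mathrm{RSubst}$.
Since $\tau \in {\downarrow}\sigma'$ then, by case (15a) of Proposition 15, we have $y \notin \mathrm{hvars}(\tau)$.

By Lemma 40, we have $\mathcal{RT} \vdash \forall(\sigma \to (x = x\sigma^i))$. Thus, since we also have
$\tau \in {\downarrow}\sigma$, we obtain $\mathcal{RT} \vdash \forall(\tau \to (x = x\sigma^i))$. By applying Lemma 42, we have
that $\mathrm{rt}(x, \tau) = \mathrm{rt}(x\sigma^i, \tau)$ and thus, by Proposition 13, we obtain $x \in \mathrm{hvars}(\tau)$
if and only if $\mathrm{vars}(x\sigma^i) \subseteq \mathrm{hvars}(\tau)$. However, as observed before, we know that
$y \in \mathrm{vars}(x\sigma^i) \setminus \mathrm{hvars}(\tau)$, so that we also have $x \notin \mathrm{hvars}(\tau)$.

Therefore $x \in \mathrm{gvars}(\sigma) \cap \mathrm{hvars}(\sigma)$ and, by case (15b) of Proposition 15,
for all $\tau \in {\downarrow}\sigma$, $x \in \mathrm{gvars}(\tau) \cap \mathrm{hvars}(\tau)$. As a consequence, for all
$\tau \in {\downarrow}\sigma$, $(x)(\mathrm{gval}(\tau)) = 1$, so that, by Definition 28, we can conclude that
$\sigma \in \gamma_G(x)$. □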
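
The construction in this proof (bind a free variable $y$ of a finite, non-ground term to a term that contains $y$ itself) is easy to replay on a small example. The sketch below is illustrative only: it represents a substitution as a Python dictionary, variables as strings and compound terms as tuples, and it decides finiteness by looking for a cycle reachable through the bindings, which is an assumed operational reading of hvars and gvars rather than the paper's Definitions 9 and 12.

```python
def term_vars(term):
    """Variables are strings; compound terms are tuples (functor, arg1, ..., argn)."""
    if isinstance(term, str):
        return {term}
    return set().union(*(term_vars(arg) for arg in term[1:]))


def is_finite(var, subst, path=frozenset()):
    """True when no binding cycle is reachable from var, i.e. var denotes a finite tree."""
    if var in path:
        return False
    if var not in subst:
        return True
    return all(is_finite(v, subst, path | {var})
               for v in term_vars(subst[var]))


def is_ground(var, subst, seen=None):
    """True when no unbound variable is reachable from var (the tree may be infinite)."""
    seen = set() if seen is None else seen
    if var in seen:
        return True
    if var not in subst:
        return False
    seen.add(var)
    return all(is_ground(v, subst, seen) for v in term_vars(subst[var]))


sigma = {"x": ("g", "y")}              # x is finite but not ground
tau = {**sigma, "y": ("f", "y")}       # instance of sigma with y -> f(y)

assert is_finite("x", sigma) and not is_ground("x", sigma)
assert not is_finite("x", tau)         # x has lost finiteness in the instance
```

So a variable that is finite but not ground in `sigma` cannot be finite in every instance, which is why definite finiteness implies definite groundness.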



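Proof of Theorem 30 on page 28.

Proof of (30a). Since $\psi \land \psi' \models \psi$, the inclusion

    $\gamma_H(h) \cap \gamma_F(\phi) \cap \gamma_G(\psi) \supseteq \gamma_H(h) \cap \gamma_F(\phi) \cap \gamma_G(\psi \land \psi')$

follows by the monotonicity of $\gamma_G$.

We now prove the reverse inclusion. Let us assume $\sigma \in \gamma_H(h) \cap \gamma_F(\phi) \cap \gamma_G(\psi)$.
By Proposition 72 we have that $\gamma_G(\psi \land \psi') = \gamma_G(\psi) \cap \gamma_G(\psi')$. Therefore it
is enough to show that $\sigma \in \gamma_G(\psi')$. By hypothesis, $\psi' = \mathrm{pos}(\exists \mathit{VI} \setminus h \mathbin{.} \phi)$.
Moreover, by Definition 22, $h \subseteq \mathrm{hvars}(\sigma)$. Thus, to prove the result, we will
show, by contradiction, that $\sigma \in \gamma_G(\mathrm{pos}(\exists \mathit{VI} \setminus \mathrm{hvars}(\sigma) \mathbin{.} \phi))$.

Suppose therefore that $\sigma \notin \gamma_G(\mathrm{pos}(\exists \mathit{VI} \setminus \mathrm{hvars}(\sigma) \mathbin{.} \phi))$. Then there exists
$\tau \in {\downarrow}\sigma$ such that

    $\mathrm{pos}(\exists \mathit{VI} \setminus \mathrm{hvars}(\sigma) \mathbin{.} \phi)(\mathrm{gval}(\tau)) = 0$.    (B.74)

Let $z \in \mathrm{hvars}(\sigma) \cap \mathit{VI}$. By Proposition 13, $\mathrm{rt}(z, \sigma) \in \mathrm{HTerms}$. By Proposition 44,
there exists $i \in \mathbb{N}$ such that $\mathrm{rt}(z, \sigma) = z\sigma^i$ and $\mathrm{vars}(z\sigma^i) \cap \mathrm{dom}(\sigma) = \emptyset$.
Therefore, by Definition 12, $\mathrm{vars}(z\sigma^i) \subseteq \mathrm{hvars}(\sigma)$. Thus, we have

    $\mathrm{vars}(z\sigma^i) \subseteq \mathrm{hvars}(\sigma) \setminus \mathrm{dom}(\sigma)$.    (B.75)

By Lemma 40, as $\tau \in {\downarrow}\sigma$, $\mathcal{RT} \vdash \forall(\tau \to (z = z\sigma^i))$. By Lemma 42, we have
$\mathrm{rt}(z, \tau) = \mathrm{rt}(z\sigma^i, \tau)$ so that, by Proposition 52,

    $z \in \mathrm{gvars}(\tau) \iff \mathrm{vars}(z\sigma^i) \subseteq \mathrm{gvars}(\tau)$.    (B.76)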

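Take $t \in \mathrm{GTerms} \cap \mathrm{HTerms}$ and let

    $\upsilon_1 \stackrel{\mathrm{def}}{=} \{\, y \mapsto t \mid y \in (\mathrm{hvars}(\sigma) \cap \mathrm{gvars}(\tau)) \setminus \mathrm{dom}(\sigma) \,\}$.

As we assumed that Sig contains a function symbol of non-zero arity, for each
$y \in \mathrm{Vars}$ there exists $t_y \in \mathrm{HTerms} \setminus \{y\}$ such that $\mathrm{vars}(t_y) = \{y\}$. Thus let

    $\upsilon_2 \stackrel{\mathrm{def}}{=} \{\, y \mapsto t_y \mid y \in (\mathit{VI} \cup \mathrm{vars}(\sigma)) \cap \mathrm{hvars}(\sigma),\ y \notin \mathrm{gvars}(\tau) \cup \mathrm{dom}(\sigma) \,\}$.

Note that $\upsilon_1, \upsilon_2 \in \mathrm{RSubst}$, $\mathrm{vars}(\upsilon_1) \cap \mathrm{vars}(\upsilon_2) = \emptyset$ and $\mathrm{vars}(\upsilon_i) \cap \mathrm{dom}(\sigma) = \emptyset$,
for $i = 1, 2$. Thus, by Lemma 39, $\tau' \stackrel{\mathrm{def}}{=} (\sigma \cup \upsilon_1 \cup \upsilon_2) \in \mathrm{RSubst}$ is satisfiable in
$\mathcal{RT}$.

We now show that

    $z \in \mathrm{gvars}(\tau) \iff z \in \mathrm{hvars}(\tau')$.    (B.77)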

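Assume first that $z \in \mathrm{gvars}(\tau)$. Then, by (B.76), we have $\mathrm{vars}(z\sigma^i) \subseteq \mathrm{gvars}(\tau)$.
From this, since also (B.75) holds, we obtain $\mathrm{vars}(z\sigma^i) \subseteq \mathrm{dom}(\upsilon_1)$ so that,
by Definitions 9 and 12, $\mathrm{vars}(z\sigma^i) \subseteq \mathrm{gvars}(\upsilon_1) \cap \mathrm{hvars}(\upsilon_1)$. Since $\tau' \in {\downarrow}\upsilon_1$,
by case (15b) of Proposition 15, $\mathrm{vars}(z\sigma^i) \subseteq \mathrm{gvars}(\tau') \cap \mathrm{hvars}(\tau')$. Thus, by
Propositions 13 and 52, $\mathrm{rt}(z\sigma^i, \tau') \in \mathrm{GTerms} \cap \mathrm{HTerms}$. Now $\tau' \in {\downarrow}\sigma$ so
that, by Lemma 40, $\mathcal{RT} \vdash \forall(\tau' \to (z = z\sigma^i))$. By Lemma 42, $\mathrm{rt}(z\sigma^i, \tau') =
\mathrm{rt}(z, \tau') \in \mathrm{GTerms} \cap \mathrm{HTerms}$ so that, by Proposition 13 and Proposition 52,
$z \in \mathrm{hvars}(\tau')$.

We prove the other direction by contraposition, assuming that $z \notin \mathrm{gvars}(\tau)$.
By (B.76), there exists $y \in \mathrm{vars}(z\sigma^i) \setminus \mathrm{gvars}(\tau)$. Also note that $y \in \mathit{VI} \cup \mathrm{vars}(\sigma)$
and, by (B.75), $y \notin \mathrm{dom}(\sigma)$ so that $y \in \mathrm{dom}(\upsilon_2)$. By Definition 12, we have
$y \notin \mathrm{hvars}(\upsilon_2)$ and, since $\tau' \in {\downarrow}\upsilon_2$, by case (15a) of Proposition 15, $y \notin \mathrm{hvars}(\tau')$.
Thus, by Proposition 13, we have that $\mathrm{rt}(z\sigma^i, \tau') \notin \mathrm{HTerms}$. Moreover, as
$\mathcal{RT} \vdash \forall(\tau' \to (z = z\sigma^i))$, by Lemma 42 we have $\mathrm{rt}(z\sigma^i, \tau') = \mathrm{rt}(z, \tau') \notin
\mathrm{HTerms}$ and therefore, by Proposition 13, $z \notin \mathrm{hvars}(\tau')$.

Since $z$ was an arbitrary variable in $\mathrm{hvars}(\sigma) \cap \mathit{VI}$, it follows from (B.74)
and (B.77) that

    $\mathrm{pos}(\exists \mathit{VI} \setminus \mathrm{hvars}(\sigma) \mathbin{.} \phi)(\mathrm{hval}(\tau')) = 0$.    (B.78)

We have by hypothesis that $\sigma \in \gamma_F(\phi)$, so that, as $\tau' \in {\downarrow}\sigma$, by Definition 22
we have $\phi(\mathrm{hval}(\tau')) = 1$. Therefore, since $\phi \models \mathrm{pos}(\exists \mathit{VI} \setminus \mathrm{hvars}(\sigma) \mathbin{.} \phi)$, we
obtain $\mathrm{pos}(\exists \mathit{VI} \setminus \mathrm{hvars}(\sigma) \mathbin{.} \phi)(\mathrm{hval}(\tau')) = 1$, which contradicts (B.78).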
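
Proof of (30b). Since $\phi \land \phi' \models \phi$, the inclusion

    $\gamma_H(h) \cap \gamma_F(\phi) \cap \gamma_G(\psi) \supseteq \gamma_H(h) \cap \gamma_F(\phi \land \phi') \cap \gamma_G(\psi)$

follows by the monotonicity of $\gamma_F$.

We now prove the reverse inclusion. Assume that $\sigma \in \gamma_H(h) \cap \gamma_F(\phi) \cap \gamma_G(\psi)$.
By Proposition 70 we have that $\gamma_F(\phi \land \phi') = \gamma_F(\phi) \cap \gamma_F(\phi')$. Therefore it is
enough to show that $\sigma \in \gamma_F(\phi')$. By hypothesis, $\phi' = \exists \mathit{VI} \setminus h \mathbin{.} \psi$. Moreover,
by Definition 16, $h \subseteq \mathrm{hvars}(\sigma)$. Thus, to prove the result, we will show, by
contradiction, that $\sigma \in \gamma_F(\exists \mathit{VI} \setminus \mathrm{hvars}(\sigma) \mathbin{.} \psi)$.

Suppose therefore that $\sigma \notin \gamma_F(\exists \mathit{VI} \setminus \mathrm{hvars}(\sigma) \mathbin{.} \psi)$. Then there exists $\tau \in {\downarrow}\sigma$
such that

    $(\exists \mathit{VI} \setminus \mathrm{hvars}(\sigma) \mathbin{.} \psi)(\mathrm{hval}(\tau)) = 0$.    (B.79)

Take $t \in \mathrm{GTerms} \cap \mathrm{HTerms}$ and let

    $\upsilon \stackrel{\mathrm{def}}{=} \{\, y \mapsto t \mid y \in \mathrm{vars}(\sigma) \cap (\mathrm{hvars}(\tau) \setminus \mathrm{dom}(\sigma)) \,\}$.    (B.80)

By Lemma 39, $\tau' \stackrel{\mathrm{def}}{=} \sigma \cup \upsilon \in \mathrm{RSubst}$ is satisfiable in $\mathcal{RT}$.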

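Let $z$ be any variable in $\mathrm{hvars}(\sigma)$. By Proposition 13, $\mathrm{rt}(z, \sigma) \in \mathrm{HTerms}$.
Then, by Proposition 44, there must exist $i \in \mathbb{N}$ such that $\mathrm{rt}(z, \sigma) = z\sigma^i$ and
$\mathrm{vars}(z\sigma^i) \cap \mathrm{dom}(\sigma) = \emptyset$. Therefore, by Definition 12, $\mathrm{vars}(z\sigma^i) \subseteq \mathrm{hvars}(\sigma)$.
Thus, we have

    $\mathrm{vars}(z\sigma^i) \subseteq \mathrm{hvars}(\sigma) \setminus \mathrm{dom}(\sigma)$.    (B.81)

By Lemma 40, as $\tau \in {\downarrow}\sigma$, $\mathcal{RT} \vdash \forall(\tau \to (z = z\sigma^i))$. By Lemma 42, we have
$\mathrm{rt}(z, \tau) = \mathrm{rt}(z\sigma^i, \tau)$ so that, by Proposition 13,

    $z \in \mathrm{hvars}(\tau) \iff \mathrm{vars}(z\sigma^i) \subseteq \mathrm{hvars}(\tau)$.    (B.82)

We now show that

    $\mathrm{hvars}(\tau) = \mathrm{hvars}(\sigma) \cap \mathrm{gvars}(\tau')$.    (B.83)

Since $\tau \in {\downarrow}\sigma$, it follows from case (15a) of Proposition 15 that
$\mathrm{hvars}(\tau) \subseteq \mathrm{hvars}(\sigma)$. Thus, as $z \in \mathrm{hvars}(\sigma)$, either $z \in \mathrm{hvars}(\tau)$ or
$z \in \mathrm{hvars}(\sigma) \setminus \mathrm{hvars}(\tau)$. We consider these cases separately.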

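First, assume that $z \in \mathrm{hvars}(\tau)$. Then, by (B.82), $\mathrm{vars}(z\sigma^i) \subseteq \mathrm{hvars}(\tau)$. Also,
by case (15a) of Proposition 15, we have $z \in \mathrm{hvars}(\sigma)$, so that we can apply
(B.81) to derive $\mathrm{vars}(z\sigma^i) \cap \mathrm{dom}(\sigma) = \emptyset$. Therefore, $\mathrm{vars}(z\sigma^i) \subseteq \mathrm{dom}(\upsilon)$
and, by Definitions 9 and 12, $\mathrm{vars}(z\sigma^i) \subseteq \mathrm{gvars}(\upsilon) \cap \mathrm{hvars}(\upsilon)$. Since $\tau' \in {\downarrow}\upsilon$, by
case (15b) of Proposition 15, we have $\mathrm{vars}(z\sigma^i) \subseteq \mathrm{gvars}(\tau') \cap \mathrm{hvars}(\tau')$. Thus,
by Propositions 13 and 52, $\mathrm{rt}(z\sigma^i, \tau') \in \mathrm{GTerms} \cap \mathrm{HTerms}$. Now $\tau' \in {\downarrow}\sigma$ so
that, by Lemma 40, we have $\mathcal{RT} \vdash \forall(\tau' \to (z = z\sigma^i))$. Thus, by Lemma 42,
$\mathrm{rt}(z\sigma^i, \tau') = \mathrm{rt}(z, \tau') \in \mathrm{GTerms} \cap \mathrm{HTerms}$ so that, by Propositions 13 and
52, $z \in \mathrm{hvars}(\tau') \cap \mathrm{gvars}(\tau')$. Hence, by case (15a) of Proposition 15, we can
conclude $z \in \mathrm{hvars}(\sigma) \cap \mathrm{gvars}(\tau')$. Thus $\mathrm{hvars}(\tau) \subseteq \mathrm{hvars}(\sigma) \cap \mathrm{gvars}(\tau')$.

Secondly, assume that $z \in \mathrm{hvars}(\sigma) \setminus \mathrm{hvars}(\tau)$. Since $z \notin \mathrm{hvars}(\tau)$, by (B.82),
there exists $y \in \mathrm{vars}(z\sigma^i) \setminus \mathrm{hvars}(\tau)$. Also, since $z \in \mathrm{hvars}(\sigma)$, by (B.81), we
have $y \in \mathrm{hvars}(\sigma) \setminus \mathrm{dom}(\sigma)$ so that, by Definition 9, we have $y \notin \mathrm{gvars}(\sigma)$.
By (B.80), since $y \notin \mathrm{dom}(\sigma) \cup \mathrm{hvars}(\tau)$, we have $y \notin \mathrm{dom}(\upsilon)$ so that
$y \notin \mathrm{gvars}(\tau')$. Thus, by Proposition 52, we have $\mathrm{rt}(z\sigma^i, \tau') \notin \mathrm{GTerms}$. Moreover,
since we have $\mathcal{RT} \vdash \forall(\tau' \to (z = z\sigma^i))$, we obtain, by Lemma 42, $\mathrm{rt}(z\sigma^i, \tau') =
\mathrm{rt}(z, \tau') \notin \mathrm{GTerms}$ and thus, by Proposition 52, $z \notin \mathrm{gvars}(\tau')$. Thus
$\mathrm{hvars}(\tau) \supseteq \mathrm{hvars}(\sigma) \cap \mathrm{gvars}(\tau')$.

It follows from (B.79) and (B.83) that

    $(\exists \mathit{VI} \setminus \mathrm{hvars}(\sigma) \mathbin{.} \psi)(\mathrm{gval}(\tau')) = 0$.    (B.84)

We have by hypothesis that $\sigma \in \gamma_G(\psi)$ so that, as $\tau' \in {\downarrow}\sigma$, by Definition 28
we have $\psi(\mathrm{gval}(\tau')) = 1$. Therefore, as $\psi \models \exists \mathit{VI} \setminus \mathrm{hvars}(\sigma) \mathbin{.} \psi$,

    $(\exists \mathit{VI} \setminus \mathrm{hvars}(\sigma) \mathbin{.} \psi)(\mathrm{gval}(\tau')) = 1$,

which contradicts (B.84). □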

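Proof of Theorem 31 on page 28. Since $\psi \land \bigwedge \mathrm{true}(\phi) \models \psi$, the inclusion

\[ \gamma_F(\phi) \cap \gamma_G(\psi) \supseteq \gamma_F(\phi) \cap \gamma_G\bigl(\psi \land \bigwedge \mathrm{true}(\phi)\bigr) \]

follows by the monotonicity of $\gamma_G$. To prove the inclusion

\[ \gamma_F(\phi) \cap \gamma_G(\psi) \subseteq \gamma_F(\phi) \cap \gamma_G\bigl(\psi \land \bigwedge \mathrm{true}(\phi)\bigr) \]

we will show that $\gamma_F(\phi) \subseteq \gamma_G\bigl(\bigwedge \mathrm{true}(\phi)\bigr)$. The thesis will thus follow by Proposition 72. We have

\begin{align*}
\gamma_F(\phi) &\subseteq \gamma_F\bigl(\bigwedge \mathrm{true}(\phi)\bigr) && [\text{since } \phi \models \bigwedge \mathrm{true}(\phi)] \\
&= \bigcap \bigl\{\, \gamma_F(x) \bigm| x \in \mathrm{true}(\phi) \,\bigr\} && [\text{by Proposition 70}] \\
&\subseteq \bigcap \bigl\{\, \gamma_G(x) \bigm| x \in \mathrm{true}(\phi) \,\bigr\} && [\text{by Proposition 73}] \\
&= \gamma_G\bigl(\bigwedge \mathrm{true}(\phi)\bigr). && [\text{by Proposition 72}]
\end{align*}

□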



Part of the proof of Theorem 34 relies on the following lemma. 

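Lemma 74 Let $h \in H$ and $\phi \in \mathrm{Bfun}$ be such that $\gamma_H(h) \cap \gamma_F(\phi) \neq \emptyset$. Then $(\exists \mathit{VI} \setminus h \mathrel{.} \phi) \in \mathrm{Pos}$.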



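PROOF. By hypothesis, there exists $\sigma \in \gamma_H(h) \cap \gamma_F(\phi)$ so that, by Definitions 16 and 22, we have $h \subseteq \mathrm{hvars}(\sigma)$ and $\bigwedge \mathrm{hvars}(\sigma) \models \exists \mathit{VI} \setminus \mathrm{hvars}(\sigma) \mathrel{.} \phi$.

Towards a contradiction, suppose that $(\exists \mathit{VI} \setminus h \mathrel{.} \phi) \notin \mathrm{Pos}$, i.e.,

\[ (\exists \mathit{VI} \setminus h \mathrel{.} \phi)(\mathbf{1}) = 0. \]

Since existential quantification is an extensive operator on $\mathrm{Bfun}$ and $h \subseteq \mathrm{hvars}(\sigma)$, we obtain $\exists \mathit{VI} \setminus \mathrm{hvars}(\sigma) \mathrel{.} \phi \models \exists \mathit{VI} \setminus h \mathrel{.} \phi$, so that

\[ (\exists \mathit{VI} \setminus \mathrm{hvars}(\sigma) \mathrel{.} \phi)(\mathbf{1}) = 0. \]

Moreover, since $\bigwedge \mathrm{hvars}(\sigma) \models \exists \mathit{VI} \setminus \mathrm{hvars}(\sigma) \mathrel{.} \phi$, we have

\[ \bigl(\bigwedge \mathrm{hvars}(\sigma)\bigr)(\mathbf{1}) = 0, \]

which is a contradiction. Therefore, $(\exists \mathit{VI} \setminus h \mathrel{.} \phi) \in \mathrm{Pos}$. □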
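The two facts used in the proof of Lemma 74 (existential quantification is extensive, and a function is in Pos exactly when it maps the all-true assignment to 1) can be checked by brute force on a small instance. The following is only an illustrative sketch: the encoding of Boolean functions as sets of satisfying assignments, the three-variable `VI`, the example `phi`, and the helper names `fun`, `exists`, `entails` and `is_pos` are all assumptions of this example and do not come from the paper.

```python
from itertools import product

VI = ("x", "y", "z")                        # hypothetical variables of interest
ALL = set(product((0, 1), repeat=len(VI)))  # every truth assignment, as a bit tuple
ONE = (1,) * len(VI)                        # the all-true assignment

def fun(pred):
    """Encode a Boolean function as the frozenset of its satisfying assignments."""
    return frozenset(a for a in ALL if pred(dict(zip(VI, a))))

def exists(variables, f):
    """Quantify `variables` away: keep a if it agrees with a model of f on the rest."""
    keep = [i for i, v in enumerate(VI) if v not in variables]
    return frozenset(a for a in ALL
                     if any(all(a[i] == m[i] for i in keep) for m in f))

def entails(f, g):
    """f |= g holds when every model of f is also a model of g."""
    return f <= g

def is_pos(f):
    """Membership in Pos: the all-true assignment is a model."""
    return ONE in f

phi = fun(lambda a: a["x"] and a["y"] and a["z"])  # example function (assumption)
h = {"x"}                                          # example finiteness set
hvars_sigma = {"x", "y"}                           # stands in for hvars(sigma); contains h

few = exists(set(VI) - hvars_sigma, phi)   # quantifies VI \ hvars(sigma)
many = exists(set(VI) - h, phi)            # quantifies the larger set VI \ h

print(entails(phi, few), entails(few, many), is_pos(few), is_pos(many))
```

In this instance quantifying over the larger set yields the weaker function, so membership in Pos passes from `few` to `many`; the lemma uses the contrapositive of that step.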



Proof of Theorem 34 on page 31. Let us assume the hypotheses and prove 
each statement in turn. 

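Consider first the case where $i = 1$, which corresponds to the application of the abstract disjunction operator. Then, for the finiteness component $h_1$ we have:

\begin{align*}
h_1 &= h \cap h' \\
&\supseteq \mathrm{true}\bigl(\phi \land \bigwedge h\bigr) \cap \mathrm{true}\bigl(\phi' \land \bigwedge h'\bigr) \\
&\supseteq \mathrm{true}\bigl(\phi \land \bigwedge (h \cap h')\bigr) \cap \mathrm{true}\bigl(\phi' \land \bigwedge (h \cap h')\bigr) \\
&= \mathrm{true}\Bigl(\bigl(\phi \land \bigwedge (h \cap h')\bigr) \lor \bigl(\phi' \land \bigwedge (h \cap h')\bigr)\Bigr) \\
&= \mathrm{true}\bigl((\phi \lor \phi') \land \bigwedge (h \cap h')\bigr) \\
&= \mathrm{true}\bigl(\phi_1 \land \bigwedge h_1\bigr).
\end{align*}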
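The steps of the derivation for $h_1$ can be replayed on a concrete instance. The sketch below is illustrative only and rests on assumptions that are not in the paper: Boolean functions are encoded as sets of satisfying assignments over a hypothetical three-variable `VI`, `true_vars` plays the role of $\mathrm{true}(\cdot)$, `conj` builds $\bigwedge s$, and the particular `phi`, `phi_p`, `h` and `h_p` are arbitrary examples chosen to satisfy the hypothesis $h \supseteq \mathrm{true}(\phi \land \bigwedge h)$.

```python
from itertools import product

VI = ("x", "y", "z")                        # hypothetical variables of interest
ALL = set(product((0, 1), repeat=len(VI)))  # every truth assignment, as a bit tuple

def fun(pred):
    """Encode a Boolean function as the frozenset of its satisfying assignments."""
    return frozenset(a for a in ALL if pred(dict(zip(VI, a))))

def conj(s):
    """The conjunction of the variables in the set s."""
    return fun(lambda a: all(a[v] for v in s))

def true_vars(f):
    """true(f): variables that are 1 in every model of f (all of VI if f has no model)."""
    return {v for i, v in enumerate(VI) if all(m[i] for m in f)}

# Example inputs (assumptions): '&' on functions is conjunction, '|' is disjunction.
phi = fun(lambda a: a["x"] and (a["y"] or a["z"]))
phi_p = fun(lambda a: a["x"] and a["y"])
h, h_p = {"x"}, {"x", "y", "z"}

h1 = h & h_p                 # finiteness component after abstract disjunction
left = phi & conj(h1)        # phi  AND conjunction of h1
right = phi_p & conj(h1)     # phi' AND conjunction of h1
phi1 = phi | phi_p           # finite-tree dependencies after abstract disjunction

print(sorted(h1), sorted(true_vars(phi1 & conj(h1))))
```

Weakening $\bigwedge h$ to $\bigwedge (h \cap h')$ can only shrink the set of entailed variables, and intersecting two such sets matches taking $\mathrm{true}$ of the disjunction; both facts hold for this instance.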

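For the finite-tree dependencies component $\phi_1$, we have:

\begin{align*}
\phi_1 &= \phi \lor \phi' \\
&\models (\exists \mathit{VI} \setminus h \mathrel{.} \psi) \lor (\exists \mathit{VI} \setminus h' \mathrel{.} \psi') \\
&\models \bigl(\exists \mathit{VI} \setminus (h \cap h') \mathrel{.} \psi\bigr) \lor \bigl(\exists \mathit{VI} \setminus (h \cap h') \mathrel{.} \psi'\bigr) \\
&= \exists \mathit{VI} \setminus (h \cap h') \mathrel{.} \psi \lor \psi' \\
&= \exists \mathit{VI} \setminus h_1 \mathrel{.} \psi_1.
\end{align*}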

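For the groundness dependencies component $\psi_1$ we have:

\begin{align*}
\psi_1 &= \psi \lor \psi' \\
&\models \mathrm{pos}(\exists \mathit{VI} \setminus h \mathrel{.} \phi) \lor \mathrm{pos}(\exists \mathit{VI} \setminus h' \mathrel{.} \phi') \\
&= \bigl((\exists \mathit{VI} \setminus h \mathrel{.} \phi) \lor \bigwedge \mathit{VI}\bigr) \lor \bigl((\exists \mathit{VI} \setminus h' \mathrel{.} \phi') \lor \bigwedge \mathit{VI}\bigr) \\
&= (\exists \mathit{VI} \setminus h \mathrel{.} \phi) \lor (\exists \mathit{VI} \setminus h' \mathrel{.} \phi') \lor \bigwedge \mathit{VI} \\
&= \mathrm{pos}\bigl((\exists \mathit{VI} \setminus h \mathrel{.} \phi) \lor (\exists \mathit{VI} \setminus h' \mathrel{.} \phi')\bigr) \\
&\models \mathrm{pos}\Bigl(\bigl(\exists \mathit{VI} \setminus (h \cap h') \mathrel{.} \phi\bigr) \lor \bigl(\exists \mathit{VI} \setminus (h \cap h') \mathrel{.} \phi'\bigr)\Bigr) \\
&= \mathrm{pos}\bigl(\exists \mathit{VI} \setminus (h \cap h') \mathrel{.} \phi \lor \phi'\bigr) \\
&= \mathrm{pos}(\exists \mathit{VI} \setminus h_1 \mathrel{.} \phi_1).
\end{align*}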
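The derivation for $\psi_1$ relies on two simple properties of $\mathrm{pos}$, read here as $\mathrm{pos}(f) = f \lor \bigwedge \mathit{VI}$: it distributes over disjunction, and it leaves members of Pos unchanged, which is the fact behind appeals to Lemma 74. The sketch below checks both on a toy instance. It is an illustration under assumptions: the set-of-models encoding, the three-variable `VI`, the example functions `f` and `g`, and the helper names are hypothetical and not taken from the paper.

```python
from itertools import product

VI = ("x", "y", "z")                        # hypothetical variables of interest
ALL = set(product((0, 1), repeat=len(VI)))  # every truth assignment, as a bit tuple
ONE = (1,) * len(VI)                        # the all-true assignment

def fun(pred):
    """Encode a Boolean function as the frozenset of its satisfying assignments."""
    return frozenset(a for a in ALL if pred(dict(zip(VI, a))))

def pos(f):
    """pos(f) = f OR (conjunction of all variables): add the all-true model."""
    return f | {ONE}

def exists(variables, f):
    """Quantify `variables` away: keep a if it agrees with a model of f on the rest."""
    keep = [i for i, v in enumerate(VI) if v not in variables]
    return frozenset(a for a in ALL
                     if any(all(a[i] == m[i] for i in keep) for m in f))

f = fun(lambda a: a["x"] and not a["y"])   # not in Pos: false on the all-true assignment
g = fun(lambda a: a["y"] or a["z"])        # in Pos

print(pos(f) | pos(g) == pos(f | g))       # pos distributes over disjunction
print(pos(g) == g, pos(f) == f)            # pos is the identity exactly on Pos
print(pos(exists({"y"}, f)) == exists({"y"}, f))  # quantifying y away lands in Pos here
```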

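Consider now the case where $i = 2$, which corresponds to the application of the abstract projection operator. Then, for the finiteness component $h_2$ we have:

\begin{align*}
h_2 &= h \cup \{x\} \\
&\supseteq \mathrm{true}\bigl(\phi \land \bigwedge h\bigr) \cup \{x\} \\
&\supseteq \mathrm{true}\bigl((\exists x \mathrel{.} \phi) \land \bigwedge h\bigr) \cup \{x\} \\
&= \mathrm{true}\bigl((\exists x \mathrel{.} \phi) \land \bigwedge (h \cup \{x\})\bigr) \\
&= \mathrm{true}\bigl(\phi_2 \land \bigwedge h_2\bigr).
\end{align*}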



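For the finite-tree dependencies component $\phi_2$ we have:

\begin{align*}
\phi_2 &= \exists x \mathrel{.} \phi \\
&\models \exists x \mathrel{.} \exists \mathit{VI} \setminus h \mathrel{.} \psi \\
&= \exists \mathit{VI} \setminus h \mathrel{.} \exists x \mathrel{.} \psi \\
&= \exists \mathit{VI} \setminus (h \cup \{x\}) \mathrel{.} \exists x \mathrel{.} \psi \\
&= \exists \mathit{VI} \setminus h_2 \mathrel{.} \psi_2.
\end{align*}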

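By hypothesis, $\gamma_H(h) \cap \gamma_F(\phi) \neq \emptyset$ so that we also have $\gamma_H(h \cup \{x\}) \cap \gamma_F(\exists x \mathrel{.} \phi) \neq \emptyset$. Thus, for the groundness dependencies component $\psi_2$ we have:

\begin{align*}
\psi_2 &= \exists x \mathrel{.} \psi \\
&\models \exists x \mathrel{.} \mathrm{pos}(\exists \mathit{VI} \setminus h \mathrel{.} \phi) \\
&= \exists x \mathrel{.} \exists \mathit{VI} \setminus h \mathrel{.} \phi && [\text{by Lemma 74}] \\
&= \exists \mathit{VI} \setminus h \mathrel{.} \exists x \mathrel{.} \phi \\
&= \exists \mathit{VI} \setminus (h \cup \{x\}) \mathrel{.} \exists x \mathrel{.} \phi \\
&= \mathrm{pos}\bigl(\exists \mathit{VI} \setminus (h \cup \{x\}) \mathrel{.} \exists x \mathrel{.} \phi\bigr) && [\text{by Lemma 74}] \\
&= \mathrm{pos}(\exists \mathit{VI} \setminus h_2 \mathrel{.} \phi_2).
\end{align*}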

□